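I was stuck on this error for a long time:
TypeError: expected a character buffer object
I have only just understood what I got wrong: it comes down to the difference between unicode strings and "plain" strings. I was calling the code below with a "normal" string when I had to pass a unicode one. So simply forgetting the "u" in front of the string broke execution :/ !!!
By the way, the TypeError was very unclear to me, and it still is.
Could someone explain what I was missing, and why a "plain" string is not a "character buffer object"?
You can reproduce it with the following code (extracted and (c) from here):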
def maketransU(s1, s2, todel=u""):
    """Build translation table for use with unicode.translate().
    :param s1: string of characters to replace.
    :type s1: unicode
    :param s2: string of replacement characters (same order as in s1).
    :type s2: unicode
    :param todel: string of characters to remove.
    :type todel: unicode
    :return: translation table with character code -> character code.
    :rtype: dict
    """
    # We go unicode internally - ensure callers are ok with that.
    assert (isinstance(s1,unicode))
    assert (isinstance(s2,unicode))
    trans_tab = dict( zip( map(ord, s1), map(ord, s2) ) )
    trans_tab.update( (ord(c),None) for c in todel )
    return trans_tab

#BlankToSpace_table = string.maketrans (u"\r\n\t\v\f",u"     ")
BlankToSpace_table = maketransU (u"\r\n\t\v\f",u"     ")

def BlankToSpace(text) :
    """Replace blanks characters by realspaces.
    May be good to prepare for regular expressions & Co based on whitespaces.
    :param text: the text to clean from blanks.
    :type text: string
    :return: List of parts in their apparition order.
    :rtype: [ string ]
    """
    print text, type(text), len(text)
    try:
        out = text.translate(BlankToSpace_table)
    except TypeError, e:
        raise
    return out

# for SO : the code below is just to reproduce what i did not understand
dummy = "Hello,\n, this is a \t dummy test!"
for s in (unicode(dummy), dummy):
    print repr(s)
    print repr(BlankToSpace(s))
This produces:
u'Hello,\n, this is a \t dummy test!'
Hello,
, this is a dummy test! <type 'unicode'> 32
u'Hello, , this is a dummy test!'
'Hello,\n, this is a \t dummy test!'
Hello,
, this is a dummy test! <type 'str'> 32
Traceback (most recent call last):
File "C:/treetaggerwrapper.error.py", line 44, in <module>
print repr(BlankToSpace(s))
File "C:/treetaggerwrapper.error.py", line 36, in BlankToSpace
out = text.translate(BlankToSpace_table)
TypeError: expected a character buffer object
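Best answer
The problem is that the translate method of a byte string is not the same as the translate method of a unicode string. Here is the docstring of the non-unicode version: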
S.translate(table [,deletechars]) -> string
Return a copy of the string S, where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256.
And here is the unicode version:
S.translate(table) -> unicode
Return a copy of the string S, where all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.
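You can see that the non-unicode version expects "a string of length 256", whereas the unicode version expects a "mapping" (i.e. a dict). So the problem is not that your unicode string is a buffer object and the non-unicode string is not (of course, both are buffers), but that one translate method expects such a buffer object as its table argument and the other does not. In your code, the dict built by maketransU is passed to str.translate, which wants a 256-character string, hence the TypeError.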
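As a rough illustration of the point above, here is a minimal sketch that keeps one table per string type and picks the right one at call time. The names blank_to_space, BYTE_TABLE and TEXT_TABLE are made up for this example and are not part of the original code; the version check is only there so the snippet runs on both Python 2 and Python 3 (where the same split exists between bytes and str).

```python
import sys

# Pick the text type and the byte-table builder for the running interpreter.
if sys.version_info[0] == 2:
    import string
    text_type = unicode  # Python 2 only; this branch is never run on Python 3
    make_byte_table = string.maketrans
else:
    text_type = str
    make_byte_table = bytes.maketrans

# Byte strings: translate() wants a 256-byte table.
BYTE_TABLE = make_byte_table(b"\r\n\t\v\f", b"     ")

# Unicode strings: translate() wants a mapping of ordinals to replacements.
TEXT_TABLE = dict((ord(c), u" ") for c in u"\r\n\t\v\f")


def blank_to_space(text):
    """Replace blank characters by spaces, whatever the string type."""
    if isinstance(text, text_type):
        return text.translate(TEXT_TABLE)
    return text.translate(BYTE_TABLE)


print(repr(blank_to_space(u"Hello,\n, this is a \t dummy test!")))
print(repr(blank_to_space(b"Hello,\n, this is a \t dummy test!")))
```

Whether to accept both types like this, or to keep the asserts and insist on unicode input everywhere, is a design choice; the sketch only shows that each translate method needs its own kind of table.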
Source: the Stack Overflow question "Python TypeError: expected a character buffer object, personal misunderstanding": https://stackoverflow.com/questions/10385419/