Python TypeError: expected a character buffer object, personal misunderstanding

Tags: python unicode

I was stuck on this error for a long time:

 TypeError: expected a character buffer object

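I have only just understood what I got wrong. It comes down to the difference between unicode strings and "simple" strings: I was calling the code below with a "normal" string, when I had to pass a unicode one. So simply forgetting the "u" before the string literal was breaking execution :/ !!!

By the way, the TypeError was very unclear to me, and it still is.

Could someone please explain what I was missing, and why a "simple" string is NOT a "character buffer object"?

You can reproduce it with the following code (extracted from, and (c), here):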

def maketransU(s1, s2, todel=u""):
    """Build translation table for use with unicode.translate().

    :param s1: string of characters to replace.
    :type s1: unicode
    :param s2: string of replacement characters (same order as in s1).
    :type s2: unicode
    :param todel: string of characters to remove.
    :type todel: unicode
    :return: translation table with character code -> character code.
    :rtype: dict
    """
    # We go unicode internally - ensure callers are ok with that.
    assert (isinstance(s1,unicode))
    assert (isinstance(s2,unicode))
    trans_tab = dict( zip( map(ord, s1), map(ord, s2) ) )
    trans_tab.update( (ord(c),None) for c in todel )
    return trans_tab

#BlankToSpace_table = string.maketrans (u"\r\n\t\v\f",u"     ")
BlankToSpace_table = maketransU (u"\r\n\t\v\f",u"     ")
def BlankToSpace(text) :
    """Replace blanks characters by realspaces.

    May be good to prepare for regular expressions & Co based on whitespaces.

    :param  text: the text to clean from blanks.
    :type  text: string
    :return: List of parts in their apparition order.
    :rtype: [ string ]
    """
    print text, type(text), len(text)
    try:
        out =  text.translate(BlankToSpace_table)
    except TypeError, e:
        raise
    return out

# for SO : the code below is just to reproduce what i did not understand
dummy = "Hello,\n, this is a \t dummy test!"
for s in (unicode(dummy), dummy):
    print repr(s)
    print repr(BlankToSpace(s))

This produces:

u'Hello,\n, this is a \t dummy test!'
Hello,
, this is a      dummy test! <type 'unicode'> 32
u'Hello, , this is a   dummy test!'
'Hello,\n, this is a \t dummy test!'
Hello,
, this is a      dummy test! <type 'str'> 32

Traceback (most recent call last):
  File "C:/treetaggerwrapper.error.py", line 44, in <module>
    print repr(BlankToSpace(s))
  File "C:/treetaggerwrapper.error.py", line 36, in BlankToSpace
    out =  text.translate(BlankToSpace_table)
TypeError: expected a character buffer object

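Best answer

The problem is that the translate method of a byte string is different from the translate method of a unicode string. Here is the docstring for the non-unicode version: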

S.translate(table [,deletechars]) -> string

Return a copy of the string S, where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table, which must be a string of length 256.

And here is the unicode version:

S.translate(table) -> unicode

Return a copy of the string S, where all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.

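You can see that the non-unicode version expects "a string of length 256", whereas the unicode version expects a "mapping" (i.e. a dict). So the problem is not that your unicode string is a buffer object and the non-unicode one isn't (both are buffers, of course), but that one translate method expects such a buffer object as its table argument and the other doesn't. In your code, BlankToSpace_table is a dict, so str.translate rejects it with "expected a character buffer object", while unicode.translate accepts it.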
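
As a rough illustration of the point above, here is a minimal sketch that keeps one table per string type and picks the right one at call time. It is written for Python 3, on the assumption that bytes there plays the role of Python 2's str and str the role of unicode; in Python 2 the 256-character table would be built with string.maketrans instead of bytes.maketrans. The names BLANKS, TEXT_TABLE, BYTES_TABLE and blank_to_space are hypothetical and not part of the original code.

```python
# Python 3 sketch (assumption): bytes ~ Python 2 str, str ~ Python 2 unicode.

BLANKS = "\r\n\t\v\f"

# Table for text strings: a mapping of code point -> code point.
TEXT_TABLE = {ord(c): ord(" ") for c in BLANKS}

# Table for byte strings: a 256-byte lookup table.
BYTES_TABLE = bytes.maketrans(BLANKS.encode("ascii"), b" " * len(BLANKS))


def blank_to_space(text):
    """Replace blank characters with spaces in either bytes or str.

    Hypothetical helper: it chooses the table that matches the
    translate method of the value it receives.
    """
    if isinstance(text, bytes):
        return text.translate(BYTES_TABLE)
    return text.translate(TEXT_TABLE)


dummy = "Hello,\n, this is a \t dummy test!"
text_result = blank_to_space(dummy)
bytes_result = blank_to_space(dummy.encode("ascii"))

# Handing the mapping to the byte-string method reproduces the mismatch:
# that method wants a buffer-like table, not a dict.
try:
    dummy.encode("ascii").translate(TEXT_TABLE)
    mismatch_raised = False
except TypeError:
    mismatch_raised = True
```

Dispatching on the type is only one option. Decoding every input to a text string up front and keeping a single mapping table would avoid the second table entirely.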

This question, "Python TypeError: expected a character buffer object, personal misunderstanding", comes from Stack Overflow: https://stackoverflow.com/questions/10385419/
