python - Handling encode() when converting from Python 2 to Python 3

I am converting a large project from Python 2 to Python 3 (backward compatibility with Python 2 is not required).

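While testing the conversion, I ran into a problem: some strings were being turned into bytes objects, which caused trouble. I traced it back to the following method, which is called in many places: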

def custom_format(val):
    return val.encode('utf8').strip().upper()

In Python 2:

custom_format(u'\xa0')
# '\xc2\xa0'
custom_format('bar')
# 'BAR'

In Python 3:

custom_format('\xa0')
# b'\xc2\xa0'
custom_format('bar')
# b'BAR'

This is a problem because at some point the output of custom_format is inserted into an SQL template string using format(), but 'foo = {}'.format(b'BAR') == "foo = b'BAR'", which breaks what is supposed to be SQL syntax.
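A minimal sketch of the formatting behavior described above, for anyone who wants to reproduce it. The template and variable names are only illustrative and do not come from the project:

```python
template = 'foo = {}'

# bytes has no custom __format__, so str.format() falls back to str(),
# which for bytes is the repr, including the b'' wrapper.
as_bytes = template.format(b'BAR')
as_text = template.format('BAR')

print(as_bytes)  # foo = b'BAR'
print(as_text)   # foo = BAR
```

Only the str argument produces the text the SQL template expects.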

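Simply removing the encode('utf8') part ensures that custom_format('bar') correctly returns 'BAR', but then custom_format('\xa0') returns '\xa0' instead of the '\xc2\xa0' that the Python 2 version returns. (I don't know enough about Unicode to tell whether that is a bad thing.)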
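For context on the two values being compared: in Python 3, '\xa0' is a single code point (U+00A0, no-break space), and b'\xc2\xa0' is its UTF-8 encoding. One detail worth verifying in the statement above is that str.strip() treats U+00A0 as whitespace, while bytes.strip() removes only ASCII whitespace, so the order of encode and strip affects the result. A small check, with illustrative variable names:

```python
nbsp = '\xa0'  # U+00A0 NO-BREAK SPACE, a single code point

encoded = nbsp.encode('utf8')
print(encoded)  # b'\xc2\xa0'

# bytes.strip() removes only ASCII whitespace, so the encoded form survives.
stripped_bytes = encoded.strip()

# str.strip() removes Unicode whitespace, which includes U+00A0.
stripped_text = nbsp.strip()

print(stripped_bytes)       # b'\xc2\xa0'
print(repr(stripped_text))  # ''
```

So with encode removed, val.strip().upper() applied to '\xa0' yields an empty string rather than '\xa0'.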

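Without touching the SQL or format() parts of the code, how can I make sure the Python 3 version shows the behavior expected from the Python 2 version? Is it as simple as removing encode('utf8'), or will that cause unexpected conflicts?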

Best answer

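If your goal is to ensure that every incoming string, whether str or bytes, ends up as bytes, then you have to keep encode, because Python 3 uses str (Unicode text) as its native string type, whereas Python 2 used bytes. encode converts a str to bytes. (Note that bytes objects have no encode method in Python 3, so bytes input would have to be passed through unchanged.)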

If your intent is only to make sure the query looks right, then you can remove encode and let Python 3 handle things for you.
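A minimal sketch of the second option, assuming the function should always return str. The bytes branch is an addition for illustration (the original function only called encode), and it assumes any bytes input is UTF-8:

```python
def custom_format(val):
    """Return val as stripped, upper-cased text (str), never bytes."""
    if isinstance(val, bytes):
        # Assumption: bytes input is UTF-8, as the original encode('utf8') suggests.
        val = val.decode('utf8')
    # str.strip() also removes Unicode whitespace such as U+00A0.
    return val.strip().upper()


query = 'foo = {}'.format(custom_format(' bar '))
print(query)  # foo = BAR
```

If the Python 2 result for '\xa0' has to be preserved, one option is to pass the ASCII whitespace characters explicitly, for example val.strip(' \t\n\r\x0b\x0c'), so that U+00A0 is left in place.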

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/54153986/
