python - What is unicode_literals used for?

I have run into a strange problem with __future__.unicode_literals in Python. Without importing unicode_literals I get the correct output:

# encoding: utf-8
# from __future__ import unicode_literals
name = 'helló wörld from example'
print name

But when I add the unicode_literals import:

# encoding: utf-8
from __future__ import unicode_literals
name = 'helló wörld from example'
print name

I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 4: ordinal not in range(128)

Does unicode_literals encode every string as UTF-8? What should I do to get rid of this error?

Best answer

Your terminal or console is failing to let Python know that it supports UTF-8.

Without the from __future__ import unicode_literals line, you are building a byte string that holds UTF-8 encoded bytes. With that line, you are building a unicode string.
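The difference between the two kinds of string can be seen without involving a terminal at all. The following is a minimal sketch that should run on both Python 2 and Python 3; the variable names text and data are illustrative, and the accented characters are written as escapes (\xf3 for ó, \xf6 for ö) so that the result does not depend on how the file is saved:

```python
from __future__ import unicode_literals

# With unicode_literals, a plain quoted literal is a unicode string.
text = 'hell\xf3 w\xf6rld from example'

# The byte string you get without the import holds the UTF-8 encoding instead.
data = text.encode('utf-8')

print(len(text))                # 24 code points
print(len(data))                # 26 bytes: each accented letter takes two bytes
print(isinstance(data, bytes))  # True
```

The two extra bytes are the only difference in content; what changes is which type print receives.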

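print has to treat these two values differently: a byte string is written to sys.stdout unchanged, whereas a unicode string is first encoded to bytes, and Python consults sys.stdout.encoding for that. If your system does not correctly tell Python which codec it supports, the default is ASCII.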
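That behaviour can be approximated in a few lines. This is only a sketch of the rule described above, not Python's actual implementation; bytes_written_by_print is a hypothetical helper, and stream_encoding stands in for sys.stdout.encoding:

```python
from __future__ import unicode_literals


def bytes_written_by_print(value, stream_encoding=None):
    """Approximate the bytes Python 2's print would send to sys.stdout.

    Hypothetical helper for illustration; not part of Python.
    """
    if isinstance(value, bytes):
        # Byte strings pass through untouched.
        return value
    # Unicode strings are encoded first; ASCII is the fallback codec.
    return value.encode(stream_encoding or 'ascii')


text = 'hell\xf3 w\xf6rld from example'

# A terminal that reports UTF-8: encoding succeeds.
utf8_bytes = bytes_written_by_print(text, 'utf-8')

# No encoding reported: the ASCII fallback cannot represent U+00F3.
try:
    bytes_written_by_print(text, None)
    failed_at = None
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character that cannot be encoded

print(failed_at)  # 4, the same position the traceback in the question reports
```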

Your system failed to tell Python which codec to use; sys.stdout.encoding is set to ASCII, so encoding the unicode value for printing fails.

You can verify this by manually encoding to UTF-8 when printing:

# encoding: utf-8
from __future__ import unicode_literals
name = 'helló wörld from example'
print name.encode('utf8')

You can also reproduce the problem by creating a unicode literal without the from __future__ import statement:

# encoding: utf-8
name = u'helló wörld from example'
print name

where u'..' is also a unicode literal.

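Without details about your environment, it is hard to say what the solution is; it depends heavily on the operating system and the console or terminal in use.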
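To gather those details, a small diagnostic along these lines may help. It is only a sketch: describe_output_encoding is a hypothetical helper name, and which of these values matters depends on your platform. PYTHONIOENCODING is an environment variable Python reads to override the encoding of its standard streams, and LC_ALL and LANG are locale variables that matter mainly on POSIX systems:

```python
import locale
import os
import sys


def describe_output_encoding():
    """Collect settings that commonly influence how print encodes text."""
    return {
        # What Python believes the terminal accepts (may be None when piped).
        'sys.stdout.encoding': getattr(sys.stdout, 'encoding', None),
        # The locale's preferred encoding, without changing the locale.
        'locale.getpreferredencoding': locale.getpreferredencoding(False),
        # Environment variables; None means the variable is not set.
        'PYTHONIOENCODING': os.environ.get('PYTHONIOENCODING'),
        'LC_ALL': os.environ.get('LC_ALL'),
        'LANG': os.environ.get('LANG'),
    }


for key, value in sorted(describe_output_encoding().items()):
    print('%s = %r' % (key, value))
```

If sys.stdout.encoding comes back as an ASCII variant or None, that is consistent with the failure described above.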

Regarding python - What is unicode_literals used for?, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/23370025/