python - How do I decode a JSON-like string containing Cyrillic?

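I'm trying to write a simple spider in Scrapy that collects all the adverts from a website. The problem is that all the adverts are in Cyrillic, so I get strings like this: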

1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430

Here is the spider code:

def parse_advert(self, response):
    x = HtmlXPathSelector(response)

    advert = AdvertItem()

    advert['title'] = x.select("//h1/text()").extract()
    advert['phone'] = "111111111111"
    advert['text'] = "text text text text text text"
    filename = response.url.split("/")[-2]
    open(filename, 'wb').write(str(advert['title']))

Is there a way to "translate" that string on the fly?

Thanks.

Best answer

Use str.decode('unicode-escape') (this applies to Python 2, where str is a byte string):

>>> print r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'
1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430
>>> print r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'.decode('unicode-escape')
1-комнатная квартира
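The escapes in the question most likely come from calling str() on the list that extract() returns: in Python 2 that produces the repr of each unicode element, which spells non-ASCII characters as \uXXXX. For readers on Python 3, where str has no decode method, the following is a minimal sketch of the same idea rather than a definitive implementation. The helper names decode_escapes, decode_as_json and save_titles are hypothetical and exist only for illustration. The first two assume the input is ASCII-only text containing literal \uXXXX sequences (and, for the JSON variant, no unescaped double quotes).

```python
import json

# The escaped title from the question, as a raw string so the backslashes stay literal.
escaped = r'1-\u043a\u043e\u043c\u043d\u0430\u0442\u043d\u0430\u044f \u043a\u0432\u0430\u0440\u0442\u0438\u0440\u0430'


def decode_escapes(text):
    # Hypothetical helper: go through bytes first, because in Python 3
    # only bytes objects have a decode() method.
    # Assumes `text` is pure ASCII; encode('ascii') raises otherwise.
    return text.encode('ascii').decode('unicode_escape')


def decode_as_json(text):
    # Hypothetical helper: wrap the text in quotes and parse it as a
    # JSON string literal. Assumes `text` has no unescaped double quotes.
    return json.loads('"' + text + '"')


def save_titles(path, titles):
    # Hypothetical helper: write the extracted strings themselves, joined
    # by newlines and encoded as UTF-8, instead of the repr of the list.
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('\n'.join(titles))


print(decode_escapes(escaped))
print(decode_as_json(escaped))
```

Writing the text itself, as save_titles does, avoids the escapes at the source, so no decoding step is needed afterwards.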

This question, "python - How do I decode a JSON-like string containing Cyrillic?", comes from Stack Overflow: https://stackoverflow.com/questions/18118594/
