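I'm having a problem running Python on Linux. I'm trying to learn Python and wanted to parse a small XML file and put the tags and data into lists. But every time I run the code, each element in the list is prefixed with a "u".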
[u'world']
defaultdict(<type 'list'>, {u'world': [u'data']})
My code is as follows:
import xml.sax
from collections import defaultdict

class TransformXML(xml.sax.ContentHandler):
    def __init__ (self):
        self.start_tag_name = -1
        self.tag_data = -1
        self.myDict = defaultdict(list)
        self.tags = []

    def startElement(self, name, attrs):
        self.start_tag_name = name
        print name
        print self.start_tag_name

    def characters(self, content):
        if content.strip(' \r\n\t') != "":
            self.tag_data = content.strip(' \r\n\t')
            print self.start_tag_name
            self.tags.append(self.start_tag_name)
            self.myDict[self.start_tag_name].append(content.strip(' \r\n\t'))

    def endElement(self, name):
        pass

    def __del__ (self):
        if self.myDict:
            del self.myDict
            print "deleteing myDict"
Does anyone know what the problem is?
Best answer
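This "strange" symbol basically means that the string (or character) is a Unicode string, i.e. an object of type unicode in Python 2. The u prefix is only part of how Python 2 displays such a value, for example inside a printed list or dict; it is not part of the data itself. For example, if I have a string Test: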
>>> unicode('Test')
u'Test'
>>> s = unicode('Test')
>>> type(s)
<type 'unicode'>
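To connect this back to the question, here is a minimal sketch showing that the prefix only appears when a container is printed, not when the elements themselves are used. The handler name `TagCollector` and the input `<world>data</world>` are assumptions made for illustration (the original XML file is not shown in the question). The code is written so that it runs on both Python 2 and Python 3.

```python
from __future__ import print_function

import xml.sax
from collections import defaultdict


class TagCollector(xml.sax.ContentHandler):
    """Hypothetical, trimmed-down version of the handler in the question."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.current_tag = None
        self.data = defaultdict(list)

    def startElement(self, name, attrs):
        # Remember which element the following character data belongs to.
        self.current_tag = name

    def characters(self, content):
        text = content.strip()
        if text:
            self.data[self.current_tag].append(text)


handler = TagCollector()
# Assumed sample input; the real file from the question is unknown.
xml.sax.parseString(b"<world>data</world>", handler)

# Printing the dict shows the repr() of each key and value.
# On Python 2 that repr includes the u prefix; on Python 3 it does not.
print(dict(handler.data))

# Printing or comparing the individual strings never involves the prefix.
for tag, values in handler.data.items():
    for value in values:
        print(tag, value)

print(handler.data["world"][0] == "data")
```

In other words, nothing needs to be "removed" from the list: the parsed values are ordinary text, and the u is just Python 2's way of marking the type when it prints a container.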
This is covered in the Python documentation. In summary, according to that documentation:
...a Unicode string is a sequence of code points, which are numbers from 0 to 0x10ffff. This sequence needs to be represented as a set of bytes (meaning, values from 0-255) in memory. The rules for translating a Unicode string into a sequence of bytes are called an encoding.
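The relationship between code points and bytes described in that quote can be sketched as follows. The sample string `caf\xe9` ("café") is an assumption chosen for illustration; it is not from the original post. The same code runs on Python 2 and Python 3.

```python
# A Unicode string with four code points: c, a, f, and U+00E9 (e with acute).
text = u"caf\xe9"

# An encoding turns the code points into bytes. Different encodings
# may produce different byte sequences for the same text.
utf8_bytes = text.encode("utf-8")      # U+00E9 becomes two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # U+00E9 becomes one byte: 0xE9

print(len(text))          # number of code points
print(len(utf8_bytes))    # number of bytes in UTF-8
print(len(latin1_bytes))  # number of bytes in Latin-1

# Decoding with the matching encoding recovers the original Unicode string.
print(utf8_bytes.decode("utf-8") == text)
```

If a byte string is needed, for instance when writing to a file or socket, calling `.encode()` with an explicit encoding is the usual way to get one from a Unicode string.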
Regarding "python - odd character prepended to Python list elements", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15846582/