python - pickle.dump in Python 2.7 does not save a class-variable dictionary

Tags: python python-2.7 pickle

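I am using pickle to save an object graph by dumping its root. When I load the root, it has all of its instance variables and connected object nodes. However, I also keep all the nodes in a class variable of type dict. The class variable is populated before saving, but after I unpickle the data it is empty.

Here is the class I am using: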

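import pickle
import re
from urllib2 import urlopen  # Python 2.7

class Page(object):

    # Class variable shared by all instances: maps URL -> Page
    __crawled = {}

    def __init__(self, title = '', link = '', relatedURLs = None):
        self.__title = title
        self.__link = link
        # Avoid sharing one mutable default list between instances
        self.__relatedURLs = relatedURLs if relatedURLs is not None else []
        self.__related = []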

    @property
    def relatedURLs(self):
        return self.__relatedURLs

    @property
    def title(self):
        return self.__title

    @property
    def related(self):
        return self.__related

    @property
    def crawled(self):
        return self.__crawled

    def crawl(self,url):
        if url not in self.__crawled:
            webpage = urlopen(url).read()
            patFinderTitle = re.compile('<title>(.*)</title>')
            patFinderLink = re.compile('<link rel="canonical" href="([^"]*)" />')
            patFinderRelated = re.compile('<li><a href="([^"]*)"')

            findPatTitle = re.findall(patFinderTitle, webpage)
            findPatLink = re.findall(patFinderLink, webpage)
            findPatRelated = re.findall(patFinderRelated, webpage)
            newPage = Page(findPatTitle,findPatLink,findPatRelated)
            self.__related.append(newPage)
            self.__crawled[url] = newPage
        else:
            self.__related.append(self.__crawled[url])

    def crawlRelated(self):
        for link in self.__relatedURLs:
            self.crawl(link)

This is how I save it:

# Pickle data should be written and read in binary mode
with open('medTwiceGraph.dat','wb') as outf:
    pickle.dump(root,outf)

And this is how I load it:

def loadGraph(filename): #returns root
    with open(filename,'rb') as inf:
        return pickle.load(inf)

root = loadGraph('medTwiceGraph.dat')

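All of the data is loaded except the class variable __crawled.

What am I doing wrong?

Best answer

Python does not really pickle class objects. It only saves their names and where to find them. From the pickle documentation: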

Similarly, classes are pickled by named reference, so the same restrictions in the unpickling environment apply. Note that none of the class’s code or data is pickled, so in the following example the class attribute attr is not restored in the unpickling environment:

class Foo:
    attr = 'a class attr'

picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in the top level of a module.

Similarly, when class instances are pickled, their class’s code and data are not pickled along with them. Only the instance data are pickled. This is done on purpose, so you can fix bugs in a class or add methods to the class and still load objects that were created with an earlier version of the class. If you plan to have long-lived objects that will see many versions of a class, it may be worthwhile to put a version number in the objects so that suitable conversions can be made by the class’s __setstate__() method.
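The behavior described in the quoted documentation can be reproduced in a few lines. The following is a minimal sketch, not the asker's code: `Node` and `registry` are hypothetical names chosen for illustration, and reassigning `Node.registry` stands in for starting a fresh interpreter in which the class body has just been executed again.

```python
import pickle


class Node(object):
    # Class attribute: stored on the class object, not in any instance __dict__.
    registry = {}

    def __init__(self, name):
        self.name = name              # instance attribute: part of the pickle
        Node.registry[name] = self    # class-level bookkeeping: not part of it


root = Node("root")
payload = pickle.dumps(root)

# Stand-in for a new process: the class attribute is back to its initial value.
Node.registry = {}

restored = pickle.loads(payload)
print(restored.name)    # root
print(Node.registry)    # {}
```

The pickled bytes contain a reference to the class by name plus the instance's own `__dict__`, so nothing in them can repopulate `registry` on load.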

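In your example, you can solve the problem by changing __crawled to an instance attribute or a global variable. Note that a global (like a class attribute) is still not saved automatically; it has to be pickled explicitly along with the root.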
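One way to apply this while keeping the dictionary outside the instances is to pickle it together with the root in a single `pickle.dump` call and put it back after loading. This is a sketch under stated assumptions: the `Page` class below is a simplified stand-in for the one in the question (public `crawled` instead of the name-mangled `__crawled`, no crawling logic), and `save_graph` / `load_graph` are hypothetical helper names.

```python
import io
import pickle


class Page(object):
    # Simplified stand-in: class-level index of every page seen (url -> Page).
    crawled = {}

    def __init__(self, url):
        self.url = url
        self.related = []
        Page.crawled[url] = self


def save_graph(root, fileobj):
    # Hypothetical helper: bundle the class-level dict with the root so that
    # both are written by the same dump call.
    pickle.dump({"root": root, "crawled": Page.crawled}, fileobj,
                protocol=pickle.HIGHEST_PROTOCOL)


def load_graph(fileobj):
    # Hypothetical helper: restore the index on the class, return the root.
    state = pickle.load(fileobj)
    Page.crawled = state["crawled"]
    return state["root"]


# Usage: build a tiny graph, round-trip it through an in-memory buffer.
root = Page("a")
child = Page("b")
root.related.append(child)

buf = io.BytesIO()
save_graph(root, buf)

Page.crawled = {}   # stand-in for a fresh interpreter
buf.seek(0)
restored = load_graph(buf)
```

Because the root and the dictionary go through one `dump` call, pickle's memo keeps shared objects shared: after loading, `Page.crawled["b"]` is the same object as `restored.related[0]` rather than a copy. Dumping them in two separate calls would lose that identity.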

Source: a similar question on Stack Overflow, "pickle.dump in Python 2.7 does not save a class-variable dictionary": https://stackoverflow.com/questions/16637464/
