python - Replace a macro-style class method with a decorator?

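Despite having read many articles on the subject (including [this][1] very popular one on SO), I still struggle to get a good grip on decorators. I suspect I must be stupid, but given all the stubbornness that comes with being stupid, I've decided to try to figure this out.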

That said, I suspect I have a good use case...

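Below is some code from a project of mine that extracts text from PDF files. Processing involves three steps:

1. Set up the PDFMiner objects needed to process the PDF file (boilerplate initialization).
2. Apply a processing function to the PDF file.
3. Close the file, no matter what happens.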
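
The three steps above amount to a setup / use / always-clean-up pattern. Here is a minimal sketch of that pattern in Python 3 syntax, assuming an in-memory `io.BytesIO` as a stand-in for the PDF file and a hypothetical `process` helper (neither is part of the project):

```python
import io


def process(stream, fn):
    """Apply fn to stream and close the stream no matter what happens."""
    try:
        return fn(stream)      # step 2: apply the processing function
    finally:
        stream.close()         # step 3: always runs, even if fn raises


# step 1: setup (an in-memory stand-in, not a real PDF)
stream = io.BytesIO(b"%PDF-1.4 stand-in bytes")
header = process(stream, lambda s: s.read(4))
```

A context manager packages the same `try`/`finally` shape so that callers only write the `with` line.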

I recently learned about context managers and the with statement, and this seemed like a good use case for them. So I started by defining the PDFMinerWrapper class:

class PDFMinerWrapper(object):
    '''
    Usage:
    with PDFMinerWrapper('/path/to/file.pdf') as doc:
        doc.dosomething()
    '''
    def __init__(self, pdf_doc, pdf_pwd=''):
        self.pdf_doc = pdf_doc
        self.pdf_pwd = pdf_pwd

    def __enter__(self):
        self.pdf = open(self.pdf_doc, 'rb')
        parser = PDFParser(self.pdf)  # create a parser object associated with the file object
        doc = PDFDocument()  # create a PDFDocument object that stores the document structure
        parser.set_document(doc)  # connect the parser and document objects
        doc.set_parser(parser)
        doc.initialize(self.pdf_pwd)  # pass '' if no password required
        return doc

    def __exit__(self, type, value, traceback):
        self.pdf.close()
        # if we have an error, catch it, log it, and return the info
        if isinstance(value, Exception):
            self.logError()
            print traceback
            return value
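
One detail of `__exit__` is worth keeping in mind when reading the class above: its return value decides whether an exception raised inside the `with` block propagates. A truthy return suppresses the exception, and an exception instance is normally truthy, so `return value` has that effect. The sketch below (Python 3 syntax) shows the rule with two hypothetical classes, `Swallow` and `Propagate`, that are not part of the project:

```python
class Swallow(object):
    """Hypothetical context manager whose __exit__ returns a truthy value."""
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        return exc_value is not None   # truthy -> exception is suppressed


class Propagate(object):
    """Hypothetical context manager whose __exit__ returns None."""
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        return None                    # falsy -> exception propagates


def run(manager):
    """Report whether a ValueError raised inside the block escapes it."""
    try:
        with manager:
            raise ValueError("boom")
    except ValueError:
        return "propagated"
    return "suppressed"


outcomes = (run(Swallow()), run(Propagate()))
```

Under the first behavior, code after the `with` block keeps running as if nothing had failed, which is why a caller would see a default result rather than a traceback.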

Now I can easily work with a PDF file and be sure that errors are handled gracefully. In theory, all I need to do is something like this:

with PDFMinerWrapper('/path/to/pdf') as doc:
    foo(doc)

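This works well, except that I need to check that the PDF document is extractable before applying a function to the object returned by PDFMinerWrapper. My current solution involves an intermediate step.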

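I'm working with a class called Pamplemousse, which serves as an interface for working with PDFs. It, in turn, uses PDFMinerWrapper each time an operation must be performed on the file the object is linked to.

Here is some (abridged) code that demonstrates its use: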

class Pamplemousse(object):
    def __init__(self, inputfile, passwd='', enc='utf-8'):
        self.pdf_doc = inputfile
        self.passwd = passwd
        self.enc = enc

    def with_pdf(self, fn, *args):
        result = None
        with PDFMinerWrapper(self.pdf_doc, self.passwd) as doc:
            if doc.is_extractable:  # This is the test I need to perform
                # apply function and return result
                result = fn(doc, *args)

        return result

    def _parse_toc(self, doc):
        toc = []
        try:
            toc = [(level, title) for level, title, dest, a, se in doc.get_outlines()]
        except PDFNoOutlines:
            pass
        return toc

    def get_toc(self):
        return self.with_pdf(self._parse_toc)

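Each time I want to perform an operation on the PDF file, I pass the relevant function and its arguments to the with_pdf method. The with_pdf method, in turn, uses the with statement to take advantage of PDFMinerWrapper's context manager (thus ensuring that exceptions are handled gracefully) and performs the extractability check before actually applying the function it was passed.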

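My question is as follows:

I would like to simplify this code so that I do not have to call Pamplemousse.with_pdf explicitly. My understanding is that decorators could help here, so:

1. How would I implement a decorator whose job is to invoke the with statement and perform the extractability check?
2. Can a decorator be a class method, or must my decorator be a free-standing function or class?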

Best answer

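The way I interpret your goal, you want to be able to define multiple methods on your Pamplemousse class without constantly having to wrap them in that call. Here is a really simplified version: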

def if_extractable(fn):
    # this expects to be wrapping a Pamplemousse object
    def wrapped(self, *args):
        print "wrapper(): Calling %s with" % fn, args
        result = None
        with PDFMinerWrapper(self.pdf_doc) as doc:
            if doc.is_extractable:
                result = fn(self, doc, *args)
        return result
    return wrapped


class Pamplemousse(object):

    def __init__(self, inputfile):
        self.pdf_doc = inputfile

    # get_toc will only get called if the wrapper check
    # passes the extractable test
    @if_extractable
    def get_toc(self, doc, *args):
        print "get_toc():", self, doc, args

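The decorator if_extractable is defined as just a function, but it expects to be used on instance methods of your class.

The decorated get_toc, which used to delegate to a private method, now simply expects to receive a doc object and the arguments if the check passes. Otherwise it is never called and the wrapper returns None.

With this, you can keep defining your operation functions so that they expect a doc.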
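
To see both paths of the wrapper without installing PDFMiner, here is a self-contained sketch in Python 3 syntax. `FakeDoc`, `fake_wrapper` and the `extractable` attribute are hypothetical stand-ins for the real document object, PDFMinerWrapper and the file path, and `functools.wraps` is an addition that is not in the answer's code:

```python
import functools
from contextlib import contextmanager


class FakeDoc(object):
    """Hypothetical stand-in for the document PDFMinerWrapper yields."""
    def __init__(self, is_extractable):
        self.is_extractable = is_extractable


@contextmanager
def fake_wrapper(extractable):
    """Hypothetical stand-in for PDFMinerWrapper; opens no real file."""
    yield FakeDoc(extractable)


def if_extractable(fn):
    @functools.wraps(fn)            # keep fn's name and docstring
    def wrapped(self, *args):
        with fake_wrapper(self.extractable) as doc:
            if doc.is_extractable:
                return fn(self, doc, *args)
        return None                 # check failed: fn is never called
    return wrapped


class Pamplemousse(object):
    def __init__(self, extractable):
        self.extractable = extractable

    @if_extractable
    def get_toc(self, doc, prefix):
        # doc is supplied by the decorator; callers pass only prefix
        return prefix + ": toc"


ok = Pamplemousse(True).get_toc("file")
skipped = Pamplemousse(False).get_toc("file")
```

Callers never pass `doc` themselves: the wrapper inserts it between `self` and the remaining arguments, so every decorated method shares the same signature convention.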

You could even add some type checking to make sure it is wrapping the expected class:

def if_extractable(fn):
    def wrapped(self, *args):
        # fail early when the decorated method lives on the wrong kind of object
        if not hasattr(self, 'pdf_doc'):
            raise TypeError('if_extractable() is wrapping '
                            'a non-Pamplemousse object')
        ...
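
For a version of that check that can be run as is, here is a sketch in Python 3 syntax. It assumes a few things that are not in the answer: the with statement and the is_extractable test are left out, a plain string stands in for the doc object, and the `describe` method and `NotAPdf` class are hypothetical:

```python
import functools


def if_extractable(fn):
    @functools.wraps(fn)
    def wrapped(self, *args):
        # Fail early if the decorated method lives on the wrong kind of object
        if not hasattr(self, 'pdf_doc'):
            raise TypeError('if_extractable() is wrapping '
                            'a non-Pamplemousse object')
        # The with statement and is_extractable test are omitted here;
        # this hypothetical string stands in for the doc object.
        return fn(self, 'doc for ' + self.pdf_doc, *args)
    return wrapped


class Pamplemousse(object):
    def __init__(self, inputfile):
        self.pdf_doc = inputfile

    @if_extractable
    def describe(self, doc):
        return doc


class NotAPdf(object):
    """Hypothetical class that lacks the pdf_doc attribute."""
    @if_extractable
    def describe(self, doc):
        return doc


described = Pamplemousse('a.pdf').describe()
try:
    NotAPdf().describe()
    raised = False
except TypeError:
    raised = True
```

Note that the check runs when the method is called, not when the class is defined, so a misplaced decorator only surfaces on first use.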

For "python - Replace a macro-style class method with a decorator?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11659988/
