Python - version control of file or folder contents

When an author writes a book, we use CKEditor to generate the HTML content. We use python-django to store that content in separate HTML files on disk.

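Now a client has asked us to show each file's history/revisions (a list of timestamps in a sidebar, one for every time the author presses ctrl+s), similar to what Eclipse does.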

My plan is to use a diff, comparing the HTML text stored at two different times.
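One way this comparison could be sketched, assuming both revisions are available as plain text strings, is with the standard-library `difflib` module. The function name `diff_revisions` and the sample HTML are hypothetical and only illustrate the idea.

```python
import difflib


def diff_revisions(old_html, new_html, old_label="previous", new_label="current"):
    """Return a unified diff between two saved HTML revisions as one string."""
    # keepends=True preserves line endings so the joined diff prints cleanly.
    old_lines = old_html.splitlines(keepends=True)
    new_lines = new_html.splitlines(keepends=True)
    diff = difflib.unified_diff(old_lines, new_lines,
                                fromfile=old_label, tofile=new_label)
    return "".join(diff)


if __name__ == "__main__":
    before = "<p>Chapter one</p>\n<p>Draft text</p>\n"
    after = "<p>Chapter one</p>\n<p>Final text</p>\n"
    print(diff_revisions(before, after))
```

A line-based diff like this only works well when the editor writes the HTML with line breaks; a single long line would show up as one changed line.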

But I don't know how to diff images, audio, and video.

Does anyone know how git, Eclipse, or other version control systems do this? Do they use some kind of hashing (for example SHA) when storing content on disk?
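For context on the hashing part of the question: git identifies a file's content (a "blob") by the SHA-1 of a short header plus the raw bytes, so text and binary files are handled the same way. The sketch below reproduces that ID calculation and uses it for a very small content-addressed store. The names `git_blob_id` and `store_blob` are hypothetical, and the store is simplified: real git also compresses objects and uses its own directory layout.

```python
import hashlib
import os
import tempfile


def git_blob_id(data: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content."""
    # Git hashes the header "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def store_blob(store_dir, data):
    """Write data to a file named after its hash; identical content is stored once."""
    blob_id = git_blob_id(data)
    path = os.path.join(store_dir, blob_id)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(data)
    return blob_id


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store:
        image_bytes = b"\x89PNG placeholder bytes"  # stands in for a real image
        first = store_blob(store, image_bytes)
        second = store_blob(store, image_bytes)
        # Same content gives the same ID, so only one file exists in the store.
        print(first == second, len(os.listdir(store)))
```

With this approach an image, audio, or video file is never diffed; two revisions either share an ID (unchanged) or have different IDs (changed).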

Please suggest any other approach I could use to do this.

If there is an open-source Python library for this, I am happy to use it. I searched Google but had no luck.

Best answer

Try this (I wrote a class for you):

import os
import time
import hashlib


class SimpleFileCheckSum(object):

    def __init__(self, path):

        self.path = path
        self.output = {}

    def check_path_error(self):

        # isfile() already returns False for a missing path, so one call is enough.
        return os.path.isfile(self.path)

    def get_file_size(self):

        try:
            st_data = os.stat(self.path)
            get_size = str(st_data.st_size)

        except PermissionError:

            try:

                os_size_data = str(os.path.getsize(self.path))
                self.output["SIZE"] = os_size_data

            except OSError:
                self.output["SIZE"] = "Error"

        else:
            self.output["SIZE"] = get_size

    def get_file_times(self):

        def convert_time_to_human_readable(get_time):

            return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(get_time))

        try:

            my_st_object = os.stat(self.path)

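            # Note: st_ctime is the creation time on Windows only; on Unix it is the
            # time of the last metadata change.
            file_creation_time = convert_time_to_human_readable(my_st_object.st_ctime)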
            file_last_modified_time = convert_time_to_human_readable(my_st_object.st_mtime)

        except (OSError, OverflowError, ValueError):
            self.output['TIMES'] = {"CREATION": "Error", "MODIFIED": "Error"}

        else:
            self.output['TIMES'] = {"CREATION": file_creation_time, "MODIFIED": file_last_modified_time}

    def get_file_full_path(self):

        try:

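            # os.path._getfullpathname/_getfinalpathname are private, Windows-only
            # helpers; abspath/realpath are the portable public equivalents.
            get_full_path = os.path.abspath(self.path)
            get_final_path = os.path.realpath(self.path)

        except OSError: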
            self.output['PATH'] = {"FULL": "Error", "FINAL": "Error"}

        else:
            self.output['PATH'] = {"FULL": get_full_path, "FINAL": get_final_path}

    def get_file_hashes(self):

        try:

            hash_md5 = hashlib.md5()
            hash_sha1 = hashlib.sha1()
            hash_sha256 = hashlib.sha256()
            hash_sha512 = hashlib.sha512()

            with open(self.path, "rb") as f:
                for chunk in iter(lambda: f.read(2 ** 20), b""):
                    hash_md5.update(chunk)
                    hash_sha1.update(chunk)
                    hash_sha256.update(chunk)
                    hash_sha512.update(chunk)

        except OSError:
            self.output["HASH"] = {"MD5": "Error", "SHA1": "Error", "SHA256": "Error", "SHA512": "Error"}

        else:
            self.output["HASH"] = {"MD5": hash_md5.hexdigest(), "SHA1": hash_sha1.hexdigest(),
                                   "SHA256": hash_sha256.hexdigest(), "SHA512": hash_sha512.hexdigest()}

    def call_all(self):

        if self.check_path_error() is True:

            self.get_file_full_path()
            self.get_file_size()
            self.get_file_times()
            self.get_file_hashes()

            return self.output

        else:
            return {"Error": "Your Path is Not Valid"}


if __name__ == '__main__':

    file_info = SimpleFileCheckSum("Your_file_address")
    get_last_data = file_info.call_all()

    print("Your Raw Dict Output : ", get_last_data, "\n\n")

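Note: you may ask, "If I already have my file's address, why do I need the get_file_full_path() method?" The reason is that you can pass a relative path such as "./myfile" to this class, and get_file_full_path() will return its full and final paths.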
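To connect the hashes above with the timestamp list the question asks for, here is a minimal sketch, assuming a snapshot of the file is copied on each save only when its SHA-256 digest has changed. The function names `sha256_of_file` and `record_revision`, the history directory, and the in-memory `history` list are all hypothetical; a real application would persist the list, for example in a Django model.

```python
import hashlib
import os
import shutil
import tempfile
import time


def sha256_of_file(path):
    """Hash a file in 1 MiB chunks so large media files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(2 ** 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_revision(path, history_dir, history):
    """Snapshot `path` if its content differs from the last recorded revision.

    `history` is a list of (timestamp, digest) tuples, oldest first.
    Returns True when a new revision was recorded.
    """
    digest = sha256_of_file(path)
    if history and history[-1][1] == digest:
        return False  # nothing changed since the last save
    os.makedirs(history_dir, exist_ok=True)
    snapshot = os.path.join(history_dir, digest)
    if not os.path.exists(snapshot):
        shutil.copyfile(path, snapshot)
    history.append((time.strftime("%Y-%m-%d %H:%M:%S"), digest))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        chapter = os.path.join(work, "chapter.html")
        revisions = []
        for text in ("<p>Draft</p>", "<p>Draft</p>", "<p>Final</p>"):
            with open(chapter, "w") as f:
                f.write(text)
            record_revision(chapter, os.path.join(work, "history"), revisions)
        for stamp, digest in revisions:
            print(stamp, digest[:12])
```

The sidebar can then list the timestamps from `history`, and any entry can be restored or compared by opening the snapshot named after its digest.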

Regarding "Python - version control of file or folder contents", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47057535/
