linux - git 可以使用基于补丁/差异的存储吗?

标签 linux git

据我了解,git 存储每个提交的修订的完整文件。即使它是压缩的,也没有办法与将压缩补丁存储在一个原始修订完整文件中进行竞争。对于图像等可压缩性差的二进制文件来说,这尤其是一个问题。

有没有办法让 git 使用基于补丁/差异的后端来存储修订?

我明白为什么 git 的主要用例会按照它的方式执行,但我有一个特殊的用例,如果可以的话我想使用 git,但它会占用太多空间。

谢谢

最佳答案

Git 确实使用基于差异的存储,以“delta compression”的名义默默地自动进行。它仅适用于“打包”的文件,并且打包不会在每次操作后发生。

  • git-repack文档:

    A pack is a collection of objects, individually compressed, with delta compression applied, stored in a single file, with an associated index file.

  • Git Internals - Packfiles :

    You have two nearly identical 22K objects on your disk. Wouldn’t it be nice if Git could store one of them in full but then the second object only as the delta between it and the first?

    It turns out that it can. The initial format in which Git saves objects on disk is called a “loose” object format. However, occasionally Git packs up several of these objects into a single binary file called a “packfile” in order to save space and be more efficient. Git does this if you have too many loose objects around, if you run the git gc command manually, or if you push to a remote server.

    后来:

    The really nice thing about this is that it can be repacked at any time. Git will occasionally repack your database automatically, always trying to save more space, but you can also manually repack at any time by running git gc by hand.

  • "The woes of git gc --aggressive" (Dan Farina),描述了增量压缩是对象存储的副产品,而不是修订历史:

    Git does not use your standard per-file/per-commit forward and/or backward delta chains to derive files. Instead, it is legal to use any other stored version to derive another version. Contrast this to most version control systems where the only option is simply to compute the delta against the last version. The latter approach is so common probably because of a systematic tendency to couple the deltas to the revision history. In Git the development history is not in any way tied to these deltas (which are arranged to minimize space usage) and the history is instead imposed at a higher level of abstraction.

    后来,引用 Linus 的话,关于 git gc --aggressive 丢弃旧的好增量并用更差的增量替换它们的趋势:

    So the equivalent of "git gc --aggressive" - but done properly - is to do (overnight) something like

    git repack -a -d --depth=250 --window=250
    

关于linux - git 可以使用基于补丁/差异的存储吗?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32077311/

相关文章:

linux - 无法使用 bash 脚本附加到文件

git - Tomcat/Hudson 无法连接到 Github

git - 如何更改我连接的当前存储库?

git - 如何在 visual studio 中设置输出和中间目录以依赖于当前的 git 分支?

java - 如何查找位于 linux box 上的文件的大小并在 java webapp 中显示

linux - 字符串比较不正常

java - 向其他人授予对生成的 HeapDumpOnOutOfMemoryError 文件 .hprof 的读取权限

linux - 将 laravel 项目部署到服务器?

python - 为什么从命令行调用的脚本与从 git 属性调用的脚本的行为不同?

git - 无法访问远程存储库以导入 Go 模块