c# - Return the SHA256 hash of a PDF page

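I'm writing a C# WPF application in which I insert a "header" page as the first page of each PDF document in a batch. The header page is taken from the first page of the first PDF in the batch.

The user starts this process, but I want to make sure the user cannot run it again later, because that would insert a second header.

So my plan is to take the SHA256 hash of the header page and compare it with the hash of the first page of each of the other PDFs. If they match, the first page is already the header page; otherwise we insert the header.

I knocked up the code below to test getting the hash of the first page in a PDF, but the hash is different every time it is run.

Why is it different every time?

Thanks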

using System.IO;
using System.Text;
using System.Security.Cryptography;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

namespace Syncada
{
    public class PDFDoc
    {

        private PdfDocument pdfDoc;

        public PDFDoc(string path)
        {
            pdfDoc = PdfReader.Open(path,PdfDocumentOpenMode.Import);
        }

        public string GetPageOneHash()
        {

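            byte[] hash;

            PdfPage page = pdfDoc.Pages[0];
            using (MemoryStream stream = new MemoryStream())
            {
                // Copy the first page into a new, empty document and serialize it to memory.
                PdfDocument doc = new PdfDocument();
                doc.AddPage(page);
                doc.Save(stream, false);

                // Rewind so the hash covers the whole serialized document.
                stream.Position = 0;

                using (SHA256 sha256 = SHA256.Create())
                {
                    hash = sha256.ComputeHash(stream);
                }
            }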

            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < hash.Length; i++)
            {
                sb.Append(hash[i].ToString("X2"));
            }
            return sb.ToString();
        }
    }
}

Best answer

I knocked up the code below to test getting the hash of the first page in a pdf, but the hash is different every time it is run.

Why is it different every time?

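You are not calculating the hash of the page; you are calculating the hash of a new PDF document to which you added that page. Unfortunately for this approach, a PDF document contains information such as its creation date, its last modification date, and a unique ID. Because these values differ each time the new document is created and saved, the serialized bytes differ, and you will never get the same hash value (barring a hash collision).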
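To make the effect concrete, here is a minimal sketch that does not use PdfSharp at all. `FakeDocument` is a hypothetical stand-in for a saved PDF: it prepends a freshly generated ID to some fixed page content, which is an assumption made purely for illustration and not how a real PDF is laid out. Hashing the whole "document" gives a different value on every save, while hashing only the stable content bytes is repeatable.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashDemo
{
    // Returns the SHA256 hash of the given bytes as an uppercase hex string.
    public static string HashBytes(byte[] data)
    {
        using (SHA256 sha256 = SHA256.Create())
        {
            byte[] hash = sha256.ComputeHash(data);
            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("X2"));
            }
            return sb.ToString();
        }
    }

    // Hypothetical stand-in for a saved PDF: volatile metadata followed by
    // the stable page content. A real PDF file is structured differently.
    public static byte[] FakeDocument(string pageContent, string documentId)
    {
        return Encoding.UTF8.GetBytes("ID=" + documentId + "\n" + pageContent);
    }

    public static void Main()
    {
        string pageContent = "BT /F1 12 Tf (Header page) Tj ET";

        // Two "saves" of the same page receive different IDs,
        // so the serialized bytes (and therefore the hashes) differ.
        byte[] firstSave = FakeDocument(pageContent, Guid.NewGuid().ToString());
        byte[] secondSave = FakeDocument(pageContent, Guid.NewGuid().ToString());
        Console.WriteLine("Whole-document hashes equal: " +
            (HashBytes(firstSave) == HashBytes(secondSave)));

        // Hashing only the stable content bytes gives the same value each time.
        string firstContentHash = HashBytes(Encoding.UTF8.GetBytes(pageContent));
        string secondContentHash = HashBytes(Encoding.UTF8.GetBytes(pageContent));
        Console.WriteLine("Content-only hashes equal: " +
            (firstContentHash == secondContentHash));
    }
}
```

For the original problem this suggests hashing something that stays constant across saves, such as the page's own content bytes, rather than a freshly saved document. How to extract those bytes depends on the PDF library and is not shown here; note also that a content-only hash would not cover resources the page refers to, such as fonts or images.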

This question, "c# - Return the SHA256 hash of a PDF page", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/26266949/
