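c# - Why is the File.ReadAllBytes result different from when using File.ReadAllText?

I have a text file (UTF-8 encoded) with the content "test". I tried to get a byte array from this file and convert it to a string, but the result contains one strange character. I use the following code: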

using System;
using System.IO;
using System.Text;

var path = @"C:\Users\Tester\Desktop\test\test.txt"; // UTF-8

// Read the raw bytes and decode them manually.
var bytes = File.ReadAllBytes(path);
var contents1 = Encoding.UTF8.GetString(bytes);

// Let the framework read and decode the file.
var contents2 = File.ReadAllText(path);

Console.WriteLine(contents1); // prints "?test"
Console.WriteLine(contents2); // prints "test"

contents1 is different from contents2 - why?

Best answer

As explained in ReadAllText's documentation:

This method attempts to automatically detect the encoding of a file based on the presence of byte order marks. Encoding formats UTF-8 and UTF-32 (both big-endian and little-endian) can be detected.

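So the file contains a BOM (byte order mark), and the ReadAllText method interprets it correctly, while the first approach just reads the plain bytes without interpreting them at all.

The documentation for Encoding.GetString says that it simply: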
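One way to check this is to look at the raw bytes. The following is a minimal sketch, not code from the original post: the temp file name `bom-demo.txt`, the class `BomInspection`, and the helper `HasUtf8Bom` are made up for illustration, and the file is written with a BOM on purpose so that it stands in for the `test.txt` from the question.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomInspection
{
    // Hypothetical helper: true when the buffer starts with the UTF-8 preamble (EF BB BF).
    public static bool HasUtf8Bom(byte[] bytes)
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        return bytes.Length >= preamble.Length
               && bytes.Take(preamble.Length).SequenceEqual(preamble);
    }

    public static void Main()
    {
        // Assumption: a throwaway file in the temp directory replaces the real test.txt.
        string path = Path.Combine(Path.GetTempPath(), "bom-demo.txt");

        // Write "test" with an explicit request to emit the UTF-8 BOM.
        File.WriteAllText(path, "test", new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));

        byte[] bytes = File.ReadAllBytes(path);

        Console.WriteLine(BitConverter.ToString(bytes)); // EF-BB-BF-74-65-73-74
        Console.WriteLine(HasUtf8Bom(bytes));            // True

        File.Delete(path);
    }
}
```

The first three bytes are the BOM; only the remaining four encode "test". ReadAllBytes returns all seven.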

decodes all the bytes in the specified byte array into a string

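(emphasis mine). This is of course not entirely conclusive, but your example shows that it is to be taken literally: the BOM bytes are decoded as well, which puts the character U+FEFF at the start of the string.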
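To make the difference concrete, here is a small sketch that decodes the same byte array both ways. The class `BomDecoding` and the helper `DecodeDetectingBom` are hypothetical names, and the byte array is hard-coded under the assumption that the file consists of the UTF-8 BOM followed by "test". Wrapping the bytes in a StreamReader is one possible way to get BOM handling comparable to ReadAllText when you already have the bytes in memory; it is not necessarily the only or best one.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDecoding
{
    // Hypothetical helper: decode via StreamReader, which detects and consumes a leading BOM.
    public static string DecodeDetectingBom(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumed file content: UTF-8 BOM (EF BB BF) followed by "test".
        byte[] bytes = { 0xEF, 0xBB, 0xBF, 0x74, 0x65, 0x73, 0x74 };

        // GetString decodes every byte, so the BOM becomes the character U+FEFF.
        string raw = Encoding.UTF8.GetString(bytes);
        Console.WriteLine(raw.Length);          // 5
        Console.WriteLine(raw[0] == '\uFEFF');  // True

        // The reader strips the BOM, leaving only the text.
        string clean = DecodeDetectingBom(bytes);
        Console.WriteLine(clean.Length);        // 4

        // Trimming the leading U+FEFF from the raw string gives the same result.
        Console.WriteLine(raw.TrimStart('\uFEFF') == clean); // True
    }
}
```

A console that cannot display U+FEFF typically shows it as "?", which would explain the "?test" output in the question.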

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26101859/
