Windows command line/shell - discarding the UTF-8 BOM

This question is a follow-up to another question about selectively appending lines from one file to another.

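The regular expression I am using matches the lines to keep/discard just fine. The problem is that the file is assembled from a bunch of other files, and sometimes the line I want to keep was the first line of a UTF-8 encoded file. That means the findstr command returns output like this: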

∩╗┐LineToKeep that started out as the first line in its file
LineToKeep another
LineToKeep more lines
∩╗┐LineToKeep that started out as the first line in its file
LineToKeep more

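Apart from the BOM bytes, every line is guaranteed to start with "LineToKeep". How can I get rid of these three UTF-8 BOM bytes, given that the Windows shell commands do not handle them correctly?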
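For reference, the `∩╗┐` prefix in the output above is what the three BOM bytes EF BB BF look like when they are displayed byte by byte in an OEM code page. The short Python sketch below assumes code page 437 (a common default for cmd.exe on US systems); other code pages would render the same bytes differently.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
bom = codecs.BOM_UTF8

# Assumption: the console displays bytes using OEM code page 437.
shown_in_console = bom.decode("cp437")

# Print the code points rather than the glyphs, so the script does not
# depend on the encoding of the terminal it runs in.
print(bom.hex())
print([hex(ord(ch)) for ch in shown_in_console])
```

The three code points correspond to the characters `∩`, `╗` and `┐`, which is the garbage seen in front of "LineToKeep".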

I am hoping there is a way to remove them in place, or a modification that can be made to the findstr command from the previous question.

Since I know that every line must start with either "LineToKeep" or "∩╗┐LineToKeep", I assume there is a way to evaluate something like if (Line[3:13] == "LineToKeep") { Line = Line[3:]; } for every line.
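The per-line idea above can be sketched in Python, working on raw bytes so that the BOM is exactly three bytes long. This is a minimal sketch rather than a findstr replacement; the names `strip_leading_bom` and `strip_boms` are hypothetical helpers introduced only for illustration.

```python
import codecs

BOM = codecs.BOM_UTF8  # b"\xef\xbb\xbf"


def strip_leading_bom(line: bytes) -> bytes:
    """Return the line without a leading UTF-8 BOM; other lines pass through unchanged."""
    if line.startswith(BOM):
        return line[len(BOM):]
    return line


def strip_boms(lines):
    """Apply strip_leading_bom to every line in an iterable of byte strings."""
    return [strip_leading_bom(line) for line in lines]


# Sample data modeled on the findstr output shown in the question.
sample = [
    BOM + b"LineToKeep that started out as the first line in its file\r\n",
    b"LineToKeep another\r\n",
    BOM + b"LineToKeep more\r\n",
]
cleaned = strip_boms(sample)
```

Checking for the BOM itself, instead of comparing a slice against "LineToKeep", avoids having to get the slice bounds right and leaves line endings untouched.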

Best answer

Another option, from the Unix world, for removing the BOM from a file in place:

sed -zbi "1s/^\xEF\xBB\xBF//" filepath

This requires downloading sed 4.4 for Windows from https://github.com/mbuilov/sed-windows, which provides working -z and -b options that prevent corruption of line endings.
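If installing that sed build is not an option, the same in-place removal can be approximated in Python. This is a sketch under stated assumptions, not part of the original answer: `remove_bom_in_place` is a hypothetical helper name, and the file is assumed to be small enough to read into memory.

```python
import codecs
from pathlib import Path


def remove_bom_in_place(filepath) -> bool:
    """Strip a leading UTF-8 BOM from the file; return True if one was removed."""
    path = Path(filepath)
    # Binary mode, so CRLF/LF line endings are left exactly as they are.
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Like the sed command, this only removes a BOM at the very start of the file, so it would need to be run on each source file before the files are combined.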
