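I have the following problem: I extracted a zip file with SSZipArchive (in a Swift app), and some of the file names contain "invalid" characters.
I think the reason is that I zipped the files on Windows, so the names are now encoded in ANSI.
Is there a way to convert all the "corrupted" folder and file names during unzipping?
Or afterwards? It would be no problem if I had to walk the folder tree and rename the files.
But I don't know how to find out which names are encoded in ANSI, and I don't know how to correct the character set either.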
Best answer
The official spec says that paths should be encoded in code page 437 (MS-DOS Latin US), or in UTF-8 if bit 11 of the general purpose bit flag is set:
D.1 The ZIP format has historically supported only the original IBM PC character encoding set, commonly referred to as IBM Code Page 437. This limits storing file name characters to only those within the original MS-DOS range of values and does not properly support file names in other character encodings, or languages. To address this limitation, this specification will support the following change.
D.2 If general purpose bit 11 is unset, the file name and comment should conform to the original ZIP character encoding. If general purpose bit 11 is set, the filename and comment must support The Unicode Standard, Version 4.1.0 or greater using the character encoding form defined by the UTF-8 storage specification. The Unicode Standard is published by the The Unicode Consortium (www.unicode.org). UTF-8 encoded data stored within ZIP files is expected to not include a byte order mark (BOM).
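To make the rule in D.2 concrete, here is a minimal Swift sketch of how a decoder might pick the encoding for an entry name. It is not taken from SSZipArchive or any other library: `decodeEntryName` and `utf8FlagMask` are hypothetical names, the raw name bytes and the general purpose flags are assumed to have been read from the entry header already, and the CoreFoundation lookup for code page 437 assumes an Apple platform.

```swift
import Foundation

// Bit 11 of the general purpose bit flag: when set, names are UTF-8.
let utf8FlagMask: UInt16 = 1 << 11

// Code page 437 as a String.Encoding. 0x0400 is the CoreFoundation
// constant kCFStringEncodingDOSLatinUS (assumption: Apple platform).
let codePage437 = String.Encoding(
    rawValue: CFStringConvertEncodingToNSStringEncoding(CFStringEncoding(0x0400))
)

// Hypothetical helper: decodes the raw file name bytes of one ZIP entry.
// Returns nil if the bytes are not valid in the selected encoding.
func decodeEntryName(_ rawName: Data, generalPurposeFlags: UInt16) -> String? {
    let isUTF8 = (generalPurposeFlags & utf8FlagMask) != 0
    return String(data: rawName, encoding: isUTF8 ? .utf8 : codePage437)
}

// Usage: the same name stored both ways.
let legacyName = decodeEntryName(Data([0x66, 0x81, 0x72]), generalPurposeFlags: 0)
let utf8Name = decodeEntryName(Data([0x66, 0xC3, 0xBC, 0x72]), generalPurposeFlags: 0x0800)
print(legacyName ?? "nil", utf8Name ?? "nil")
```

Names look "invalid" when the bytes are decoded with the wrong one of these two encodings. If an archiver stored names in some other legacy code page without setting bit 11, that encoding would have to be substituted for `codePage437` in the sketch.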
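I recently released an open-source Swift implementation of the ZIP file format called ZIPFoundation. It conforms to the standard and should be able to detect Windows path names and decode them correctly.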
Regarding "swift - wrong file name character set after unzipping", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42206354/