pdf - How do I inspect the cmap table and subtables in a TrueType font?

Tags: pdf fonts truetype

The PDF Reference says:

A TrueType font program’s built-in encoding maps directly from character codes to glyph descriptions, using an internal data structure called a “cmap”

It goes on to explain that a PDF processor's behaviour depends on which cmap subtables are present in the font file.

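I am trying to analyze a .ttf font file that I extracted with fontforge from a PDF generated by LibreOffice. The PDF embeds this font file as a simple font with single-byte codes. When I view the .ttf file at fontdrop.info, it reports the following "glyphIndexMap":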

{"0":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":0,"19":0,"20":0,"21":0,"22":0,"23":0,"24":0,"25":0,"26":0,"27":0,"28":0,"29":0,"30":0,"31":0,"32":0,"33":0,"34":0,"35":0,"36":0,"37":0,"38":0,"39":0,"40":0,"41":0,"42":0,"43":0,"44":0,"45":0,"46":0,"47":0,"48":0,"49":0,"50":0,"51":0,"52":0,"53":0,"54":0,"55":0,"56":0,"57":0,"58":0,"59":0,"60":0,"61":0,"62":0,"63":0,"64":0,"65":0,"66":0,"67":0,"68":0,"69":0,"70":0,"71":0,"72":0,"73":0,"74":0,"75":0,"76":0,"77":0,"78":0,"79":0,"80":0,"81":0,"82":0,"83":0,"84":0,"85":0,"86":0,"87":0,"88":0,"89":0,"90":0,"91":0,"92":0,"93":0,"94":0,"95":0,"96":0,"97":0,"98":0,"99":0,"100":0,"101":0,"102":0,"103":0,"104":0,"105":0,"106":0,"107":0,"108":0,"109":0,"110":0,"111":0,"112":0,"113":0,"114":0,"115":0,"116":0,"117":0,"118":0,"119":0,"120":0,"121":0,"122":0,"123":0,"124":0,"125":0,"126":0,"127":0,"160":0,"161":0,"162":0,"163":0,"165":0,"167":0,"168":0,"169":0,"170":0,"171":0,"172":0,"174":0,"175":0,"176":0,"177":0,"180":0,"181":0,"182":0,"183":0,"184":0,"186":0,"187":0,"191":0,"192":0,"193":0,"194":0,"195":0,"196":0,"197":0,"198":0,"199":0,"200":0,"201":0,"202":0,"203":0,"204":0,"205":0,"206":0,"207":0,"209":0,"210":0,"211":0,"212":0,"213":0,"214":0,"216":0,"217":0,"218":0,"219":0,"220":0,"223":0,"224":0,"225":0,"226":0,"227":0,"228":0,"229":0,"230":0,"231":0,"232":0,"233":0,"234":0,"235":0,"236":0,"237":0,"238":0,"239":0,"241":0,"242":0,"243":0,"244":0,"245":0,"246":0,"247":0,"248":0,"249":0,"250":0,"251":0,"252":0,"255":0,"305":0,"338":0,"339":0,"376":0,"402":0,"675":3,"710":0,"711":0,"728":0,"729":0,"730":0,"731":0,"732":0,"733":0,"916":0,"937":0,"960":0,"8211":0,"8212":0,"8216":0,"8217":0,"8218":0,"8220":0,"8221":0,"8222":0,"8224":0,"8225":0,"8226":0,"8230":0,"8240":0,"8249":0,"8250":0,"8260":0,"8364":0,"8482":0,"8706":0,"8719":0,"8721":0,"8730":0,"8734":0,"8747":0,"8776":0,"8800":0,"8804":0,"8805":0,"9674":0,"57374":0,"64257":0,"64258":0}

(The interesting part is "675":3.)

This makes sense to me: the font contains 4 glyphs, and the glyph at index 3 is the ʣ character (Unicode code point 675 decimal, U+02A3).
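To pick out the non-zero entries without reading the whole dump by eye, a small filter along these lines may help. This is only a sketch: `nonzero_entries` is a hypothetical helper name, and `sample` is a shortened excerpt of the dump above rather than the full map.

```python
import json

def nonzero_entries(glyph_index_map_json):
    """Return sorted (code point, glyph index) pairs whose glyph index is not 0."""
    mapping = json.loads(glyph_index_map_json)
    return sorted((int(cp), gid) for cp, gid in mapping.items() if gid != 0)

# Shortened excerpt of the fontdrop.info dump; the full map has many more zero entries.
sample = '{"0":0,"2":0,"3":0,"402":0,"675":3,"710":0}'

for code_point, gid in nonzero_entries(sample):
    # Show the code point in both U+XXXX and decimal form.
    print(f"U+{code_point:04X} ({code_point}) -> glyph {gid}")
```

On the excerpt this prints a single line, `U+02A3 (675) -> glyph 3`; every other listed code point maps to glyph 0.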

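In the PDF, however, this character appears in text strings as <01>, and no other encoding is given. So according to the PDF Reference, the mapping from <01> to the glyph at index 3 must come from a mapping inside the .ttf file: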

If no Encoding entry is specified in the font dictionary, the “cmap” subtable with platform ID 1 and encoding 0 will be used to map directly from character codes to glyph descriptions, without any consideration of character names. This is the normal convention for symbolic fonts.
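A direct code-to-glyph lookup of this kind is easiest to picture with a cmap format 0 (byte encoding) subtable, which is just a 6-byte header followed by 256 one-byte glyph indices. The sketch below decodes such a subtable from raw bytes. It assumes the (1,0) subtable is stored as format 0, which is not guaranteed for this particular font, and the `example` bytes are synthetic, built only to mirror the mapping the question expects (code 0x01 to glyph 3). `decode_format0` is a hypothetical helper name.

```python
import struct

def decode_format0(subtable):
    """Decode a cmap format 0 subtable into a list of 256 glyph indices."""
    # Header: uint16 format, uint16 length, uint16 language (big-endian).
    fmt, _length, _language = struct.unpack(">HHH", subtable[:6])
    if fmt != 0:
        raise ValueError(f"expected cmap format 0, got format {fmt}")
    glyph_ids = list(subtable[6:6 + 256])
    if len(glyph_ids) != 256:
        raise ValueError("format 0 subtable is truncated")
    return glyph_ids

# Synthetic subtable for illustration only: code 0x01 -> glyph 3, all others -> glyph 0.
body = bytearray(256)
body[0x01] = 3
example = struct.pack(">HHH", 0, 262, 0) + bytes(body)

glyph_ids = decode_format0(example)
print(glyph_ids[0x01])  # glyph index selected by the single-byte code <01>
```

With a real font, the bytes passed in would be the subtable located through the cmap table's encoding record for platform 1, encoding 0.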

I have confirmed that no Encoding entry is specified in the PDF. Here are the /Font and /FontDescriptor objects, extracted with qpdf:

18 0 obj
<<
  /BaseFont /BAAAAA+LiberationSerif
  /FirstChar 0
  /FontDescriptor 20 0 R
  /LastChar 1
  /Subtype /TrueType
  /ToUnicode 21 0 R
  /Type /Font
  /Widths [
    777
    802
  ]
>>
endobj

20 0 obj
<<
  /Ascent 891
  /CapHeight 981
  /Descent -216
  /Flags 4
  /FontBBox [
    -543
    -303
    1277
    981
  ]
  /FontFile2 23 0 R
  /FontName /BAAAAA+LiberationSerif
  /ItalicAngle 0
  /StemV 80
  /Type /FontDescriptor
>>
endobj
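The /Flags 4 value in the descriptor can be decoded bit by bit. The sketch below uses the font descriptor flag positions defined in the PDF Reference (bit positions are 1-based); `decode_flags` is a hypothetical helper name.

```python
# Font descriptor flag bits as defined in the PDF Reference (1-based bit positions).
FLAG_BITS = {
    1: "FixedPitch",
    2: "Serif",
    3: "Symbolic",
    4: "Script",
    6: "Nonsymbolic",
    7: "Italic",
    17: "AllCap",
    18: "SmallCap",
    19: "ForceBold",
}

def decode_flags(flags):
    """Return the names of all flag bits set in a /Flags integer."""
    return [name for bit, name in FLAG_BITS.items() if flags & (1 << (bit - 1))]

print(decode_flags(4))  # the /Flags value from object 20 above
```

For the value 4 only bit 3 is set, so the font is marked Symbolic, which is consistent with the symbolic-font convention described in the quoted passage.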

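So how can I investigate the .ttf file to confirm that the "cmap" subtable with platform ID 1 and encoding 0 is in place and contains the mapping I think it does?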

Edit: the PDF in question

Best answer

How do I inspect the cmap table and subtables in a TrueType font?

OT Master Light, from Dutch Type Library, is a free tool that is very handy for inspecting internal font tables.
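For a scriptable check alongside a GUI tool, the cmap encoding records can also be read with a few lines of standard-library Python. This is a minimal sketch, not a full font parser: it assumes a plain single-font TrueType file (not a .ttc collection or WOFF), performs no checksum or bounds validation, and `list_cmap_subtables` is a hypothetical helper name.

```python
import struct

def list_cmap_subtables(font_bytes):
    """Return (platformID, encodingID, format, absolute offset) per cmap subtable."""
    # sfnt header: uint32 version, uint16 numTables, then 3 more uint16 fields (12 bytes).
    num_tables = struct.unpack(">H", font_bytes[4:6])[0]
    cmap_offset = None
    for i in range(num_tables):
        rec = 12 + 16 * i  # each table record is 16 bytes
        tag, _checksum, offset, _length = struct.unpack(">4sIII", font_bytes[rec:rec + 16])
        if tag == b"cmap":
            cmap_offset = offset
            break
    if cmap_offset is None:
        raise ValueError("no cmap table found")

    # cmap header: uint16 version, uint16 numTables, then 8-byte encoding records.
    _version, num_subtables = struct.unpack(">HH", font_bytes[cmap_offset:cmap_offset + 4])
    result = []
    for i in range(num_subtables):
        rec = cmap_offset + 4 + 8 * i
        platform_id, encoding_id, sub_offset = struct.unpack(">HHI", font_bytes[rec:rec + 8])
        start = cmap_offset + sub_offset  # record offsets are relative to the cmap table
        fmt = struct.unpack(">H", font_bytes[start:start + 2])[0]
        result.append((platform_id, encoding_id, fmt, start))
    return result

def describe(font_bytes):
    """Format one human-readable line per cmap subtable."""
    return [
        f"platform {p}, encoding {e}, format {f}, offset {o}"
        for p, e, f, o in list_cmap_subtables(font_bytes)
    ]
```

Calling `describe(open("extracted.ttf", "rb").read())` (the file name is a placeholder) would list every subtable. A line starting with `platform 1, encoding 0` would indicate that the subtable named in the PDF Reference is present, and its offset and format tell you where and how to decode the mapping itself.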

Source: the original question, "pdf - How do I inspect the cmap table and subtables in a TrueType font?", on Stack Overflow: https://stackoverflow.com/questions/67946606/
