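I'm just curious how local applications (including browsers) read and interpret MIME types. Is a plugin for reading MIME types built into each application, or is there a special system folder in the operating system that applications consult when interpreting MIME types?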
The RFC refers to character sets when defining what MIME covers:
(1) textual message bodies in character sets other than US-ASCII
whereas MDN makes it sound like MIME is about the content-type
you would find in something like HTML:
content-type=image/jpeg
or content-type=application/javascript
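Is a UTF-8 character map used to determine their character set (glyphs), while something else logically decides what those character symbols should be interpreted as?
Or does this mean that each content type has its own special character map (like utf-8 -> js-8????) that both performs the glyph conversion of the characters and logically interprets the char glyphs as binary?
Why do both character maps and content-type sound like what MIME means? Where is the folder path on Mac and Linux systems that contains the content-type maps and interpretation logic?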
Best answer
On macOS, you can use file --mime "/path/to/filename" to report a file's MIME type.
The man page for file clarifies what happens before the MIME type lookup:
file tests each argument in an attempt to classify it. There are three
sets of tests, performed in this order: filesystem tests, magic tests,
and language tests. The first test that succeeds causes the file type to
be printed.
The filesystem tests are based on examining the return from a stat(2)
system call. The program checks to see if the file is empty, or if it's
some sort of special file. Any known file types appropriate to the system
you are running on (sockets, symbolic links, or named pipes (FIFOs)
on those systems that implement them) are intuited if they are defined in
the system header file <sys/stat.h>.
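The filesystem tests described in this part of the man page can be approximated in a few lines of Python. This is only a rough sketch of the idea, not how file itself is implemented: the function name filesystem_test and the labels it returns are made up for illustration, and it covers only a handful of the special file kinds.

```python
import os
import stat


def filesystem_test(path):
    """Return a label for empty or special files, or None if nothing matches.

    Hypothetical helper: the labels are illustrative, not file's real output.
    """
    # lstat() does not follow symbolic links, so the link itself is inspected.
    info = os.lstat(path)
    mode = info.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISREG(mode) and info.st_size == 0:
        return "empty"
    # A regular, non-empty file: the later tests have to look at its content.
    return None
```

Returning None for a regular, non-empty file mirrors the ordering in the man page: only when the filesystem tests say nothing does the content get examined.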
The magic tests are used to check for files with data in particular fixed
formats. The canonical example of this is a binary executable (compiled
program) a.out file, whose format is defined in <elf.h>, <a.out.h> and
possibly <exec.h> in the standard include directory. These files have a
``magic number'' stored in a particular place near the beginning of the
file that tells the UNIX operating system that the file is a binary
executable, and which of several types thereof. The concept of a ``magic''
has been applied by extension to data files. Any file with some invariant
identifier at a small fixed offset into the file can usually be
described in this way. The information identifying these files is read
from the compiled magic file /usr/share/file/magic.mgc, or the files in
the directory /usr/share/file/magic if the compiled file does not exist.
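To make the "invariant identifier at a small fixed offset" idea concrete, here is a minimal sketch of a magic test in Python. The table MAGIC_TABLE and the function magic_test are hypothetical and tiny compared with the real compiled magic file; the byte signatures are the well-known ones for these formats, while the MIME labels are illustrative (the real file distinguishes several kinds of ELF binary, for instance).

```python
# Hypothetical, tiny signature table: (offset, magic bytes, MIME label).
# The real magic database is far larger and supports much richer rules.
MAGIC_TABLE = [
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"\xff\xd8\xff", "image/jpeg"),
    (0, b"%PDF-", "application/pdf"),
    (0, b"\x7fELF", "application/x-executable"),
    (257, b"ustar", "application/x-tar"),  # signature not at offset 0
]


def magic_test(data):
    """Return the label of the first signature found at its offset, else None."""
    for offset, magic, mime in MAGIC_TABLE:
        if data[offset:offset + len(magic)] == magic:
            return mime
    return None
```

Note that nothing here depends on the file name or extension: the decision is made purely from the bytes near the start of the file.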
If a file does not match any of the entries in the magic file, it is
examined to see if it seems to be a text file. ASCII, ISO-8859-x, non-
ISO 8-bit extended-ASCII character sets (such as those used on Macintosh
and IBM PC systems), UTF-8-encoded Unicode, UTF-16-encoded Unicode, and
EBCDIC character sets can be distinguished by the different ranges and
sequences of bytes that constitute printable text in each set. If a file
passes any of these tests, its character set is reported. ASCII,
ISO-8859-x, UTF-8, and extended-ASCII files are identified as ``text''
because they will be mostly readable on nearly any terminal; UTF-16 and
EBCDIC are only ``character data'' because, while they contain text, it
is text that will require translation before it can be read. In addition,
file will attempt to determine other characteristics of text-type
files. If the lines of a file are terminated by CR, CRLF, or NEL,
instead of the Unix-standard LF, this will be reported. Files that contain
embedded escape sequences or overstriking will also be identified.
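The text tests can be sketched along the same lines. The following is a simplified illustration under stated assumptions, not file's actual algorithm: the hypothetical text_test only tells ASCII, UTF-8 and BOM-prefixed UTF-16 apart, ignores ISO-8859-x, EBCDIC and NEL, and treats any NUL byte outside UTF-16 as a sign of binary data.

```python
def text_test(data):
    """Guess (charset, line_terminator) for text-like bytes, or None.

    Hypothetical helper covering only ASCII, UTF-8 and BOM-prefixed UTF-16.
    """
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        candidates = ["utf-16"]  # a byte order mark announces UTF-16
    elif b"\x00" in data:
        return None  # NUL bytes: assume binary data rather than text
    else:
        candidates = ["ascii", "utf-8"]  # try the stricter charset first

    for charset in candidates:
        try:
            text = data.decode(charset)
        except UnicodeDecodeError:
            continue
        # Report the line terminator, as the man page describes.
        if "\r\n" in text:
            terminator = "CRLF"
        elif "\r" in text:
            terminator = "CR"
        else:
            terminator = "LF"
        return charset, terminator
    return None
```

This is where the character set enters the picture: it describes how the bytes of a text file map to characters, which is a separate question from what kind of content the file holds.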
Once file has determined the character set used in a text-type file, it
will attempt to determine in what language the file is written. The language
tests look for particular strings (cf. <names.h>) that can appear
anywhere in the first few blocks of a file. For example, the keyword .br
indicates that the file is most likely a troff(1) input file, just as the
keyword struct indicates a C program. These tests are less reliable than
the previous two groups, so they are performed last. The language test
routines also test for some miscellany (such as tar(1) archives).
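The language tests amount to keyword sniffing, which is why the man page calls them less reliable. A toy version, assuming a hypothetical two-entry table built from the two examples the man page gives (LANGUAGE_HINTS, language_test and the 4096-character window are all made up for illustration):

```python
# Hypothetical keyword table based on the two examples in the man page.
LANGUAGE_HINTS = [
    (".br", "troff input"),
    ("struct", "C program"),
]


def language_test(text, window=4096):
    """Return a language label if a known keyword appears early in the text."""
    head = text[:window]  # only the beginning of the file is searched
    for keyword, label in LANGUAGE_HINTS:
        if keyword in head:
            return label
    return None
```

A plain-text note that happens to contain the word struct would be misclassified by this sketch, which shows why such tests are run last.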
Any file that cannot be identified as having been written in any of the
character sets listed above is simply said to be ``data''.
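If you want to use this from a program rather than from the shell, one option is to call file and split its output. The sketch below assumes the file command is on the PATH and that it prints a line of the form type/subtype; charset=name when run with --brief --mime; the helper names parse_mime_output and mime_of are hypothetical.

```python
import subprocess


def parse_mime_output(line):
    """Split 'type/subtype; charset=name' into (mime_type, charset or None)."""
    mime_type, _, params = line.strip().partition(";")
    charset = None
    for param in params.split(";"):
        key, _, value = param.strip().partition("=")
        if key == "charset":
            charset = value
    return mime_type.strip(), charset


def mime_of(path):
    """Run `file --brief --mime` on path (assumes `file` is installed)."""
    result = subprocess.run(
        ["file", "--brief", "--mime", path],
        capture_output=True, text=True, check=True,
    )
    return parse_mime_output(result.stdout)
```

The two halves of the result correspond to the two things mixed up in the question: the MIME type says what kind of content the file holds, and the charset parameter says how its bytes map to characters.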
Source: the Stack Overflow question "Where is the location of mime plugin files in mac and linux?": https://stackoverflow.com/questions/46217787/