http - When should an asterisk be encoded in an HTTP URL?

Tags: http, url, url-encoding, rfc

According to RFC1738, an asterisk (*) "may be used unencoded within a URL":

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

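However, w3.org's Naming and Addressing material says that the asterisk is "reserved for use as having special significance within specific schemes" and implies that it should be encoded.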

Furthermore, according to RFC3986, a URL is a URI:

The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location").

It also specifies that the asterisk is a "sub-delim", which is part of the "reserved set", and that:

URI producing applications should percent-encode data octets that correspond to characters in the reserved set unless these characters are specifically allowed by the URI scheme to represent data in that component.

It also explicitly states that it updates RFC1738.

I read all of this as requiring that asterisks be encoded in a URL unless they are used for a special purpose defined by the URI scheme.

Is RFC1738 the canonical reference for the HTTP URI scheme? Does it somehow exempt the asterisk from encoding, or has RFC3986 made it obsolete in this respect?

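Wikipedia says that "[t]he character does not need to be percent-encoded when it has no reserved purpose." Does RFC1738 remove the asterisk's reserved purpose?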

Various resources and tools seem to disagree on this question.

PHP's urlencode and rawurlencode (the latter purports to follow RFC3986) do encode the asterisk.

However, JavaScript's escape and encodeURIComponent do not encode the asterisk.

And Java's URLEncoder does not encode the asterisk:

The special characters ".", "-", "*", and "_" remain the same.

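Popular online tools (the top two results of a Google search for "online url encoder") also do not encode the asterisk. The URL Encode and Decode Tool specifically states that "[t]he reserved characters have to be encoded only under certain circumstances." It goes on to list the asterisk and the ampersand as reserved characters. It encodes the ampersand but not the asterisk.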
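As one more data point on how tools differ, here is a minimal sketch using Python's standard urllib.parse module. Python is not one of the tools discussed above, and the sample string "a*b" is only an illustration. It shows that quote() percent-encodes the asterisk unless the caller lists it in the safe parameter, and that both forms decode to the same data.

```python
from urllib.parse import quote, unquote

raw = "a*b"  # hypothetical sample data containing an asterisk

# quote() always leaves letters, digits, and "_.-~" unencoded; everything
# else is percent-encoded unless it appears in `safe`.
encoded_default = quote(raw, safe="")

# Listing "*" in `safe` keeps the asterisk literal.
encoded_literal = quote(raw, safe="*")

print(encoded_default)  # a%2Ab
print(encoded_literal)  # a*b

# Both spellings decode to the same data.
print(unquote(encoded_default) == unquote(encoded_literal))  # True
```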

Other similar questions in the Stack Exchange community seem to have stale, incomplete, or unconvincing answers.

Considering all of this, when should an asterisk be encoded in an HTTP URL?

Best answer

## Short answer

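The current definition of URL syntax indicates that you never need to percent-encode the asterisk character in the path, query, or fragment components of a URL.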


HTTP 1.1

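As @Riley Major pointed out, the RFC that HTTP 1.1 references for URL syntax has been obsoleted by RFC3986, which is not as black and white about the use of the asterisk as the originally referenced RFC was.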

RFC2396 (the URL specification before January 2005; the original answer)

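The asterisk never needs to be encoded in an HTTP 1.1 URL, because * is listed as an "unreserved character" in RFC2396, which is used to define URI syntax in HTTP 1.1. Unreserved characters are allowed in the path component of a URL.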

2.3. Unreserved Characters

Data characters that are allowed in a URI but do not have a reserved purpose are called unreserved. These include upper and lower case letters, decimal digits, and a limited set of punctuation marks and symbols.

   unreserved  = alphanum | mark

   mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

Unreserved characters can be escaped without changing the semantics of the URI, but this should not be done unless the URI is being used in a context that does not allow the unescaped character to appear.
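The grammar quoted above is small enough to check mechanically. The following is a minimal sketch in Python that transcribes the RFC2396 "unreserved" production into a set; the helper name is_unreserved_rfc2396 is hypothetical and chosen only for this illustration.

```python
import string

# RFC 2396, section 2.3:  unreserved = alphanum | mark
RFC2396_MARK = set("-_.!~*'()")
RFC2396_UNRESERVED = set(string.ascii_letters + string.digits) | RFC2396_MARK


def is_unreserved_rfc2396(ch: str) -> bool:
    """Return True if `ch` is a single character in RFC 2396's unreserved set."""
    return len(ch) == 1 and ch in RFC2396_UNRESERVED


print(is_unreserved_rfc2396("*"))  # True: the asterisk is a "mark"
print(is_unreserved_rfc2396("&"))  # False: the ampersand is reserved
```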

RFC3986 (the current URL syntax for HTTP)

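RFC3986 modifies RFC2396 to make the asterisk a reserved character, on the grounds that it is "typically unsafe to decode". My reading of this RFC is that unencoded asterisk characters are allowed in the path, query, and fragment components of a URL, because these components do not specify the asterisk as a delimiter (2.2. Reserved Characters):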

These characters are called "reserved" because they may (or may not) be defined as delimiters by the generic syntax... If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

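Furthermore, 3.3. Path confirms that a subset of the reserved characters (the sub-delims) can be used unencoded in path segments (the parts of the path component delimited by /):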

Aside from dot-segments ("." and "..") in hierarchical paths, a path segment is considered opaque by the generic syntax. URI producing applications often use the reserved characters allowed in a segment. ... For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (",") reserved character is often used for similar purposes. For example, one URI producer might use a segment such as "name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another might use a segment such as "name,1.1" to indicate the same.
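To make this concrete, here is a rough sketch in Python of the RFC3986 path-segment grammar, where a segment is a run of pchar and pchar = unreserved / pct-encoded / sub-delims / ":" / "@". The function name is_valid_segment and the sample segments are assumptions made for illustration; this is a reading of the grammar, not a full URI validator.

```python
import re
import string

# Character classes from RFC 3986, sections 2.2 and 2.3.
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="  # the asterisk is a sub-delim
PCHAR_LITERALS = UNRESERVED + SUB_DELIMS + ":@"

# segment = *pchar, where a pchar is a literal from the set above
# or a percent-encoded octet such as %2A.
SEGMENT_RE = re.compile(
    r"(?:[" + re.escape(PCHAR_LITERALS) + r"]|%[0-9A-Fa-f]{2})*"
)


def is_valid_segment(segment: str) -> bool:
    """Return True if `segment` matches the RFC 3986 `segment` production."""
    return SEGMENT_RE.fullmatch(segment) is not None


print(is_valid_segment("a*b"))         # True: literal asterisk is allowed
print(is_valid_segment("a%2Ab"))       # True: so is the encoded form
print(is_valid_segment("name;v=1.1"))  # True: the RFC's own example
print(is_valid_segment("a b"))         # False: a space must be encoded
```

Under this reading, both the literal and the percent-encoded asterisk are syntactically valid in a segment; which one to produce depends on whether the application gives the asterisk a delimiter role.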

HTTP 1.0

HTTP 1.0 references RFC1738 to define URL syntax, which, through a chain of updates and obsoletions, means that it uses the same RFC for URL syntax as HTTP 1.1.

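As far as backward compatibility is concerned, RFC1738 specifies the asterisk as a reserved character, although HTTP 1.0 does not actually define any special meaning for an unencoded asterisk in the path component of a URL, should you use one. This should mean that it is still safe to put an asterisk in URLs that point to even the oldest systems.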


As a side note, the asterisk character does have a special meaning in the Request-URI in both HTTP specifications, but it cannot be expressed with an HTTP URL:

The asterisk "*" means that the request does not apply to a particular resource, but to the server itself, and is only allowed when the method used does not necessarily apply to a resource. One example would be

   OPTIONS * HTTP/1.1
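One way to see why no HTTP URL produces that request line is to derive the request target from a URL. The sketch below uses Python's urllib.parse.urlsplit; the helper name origin_form_target and the example.com URLs are assumptions for illustration only. For an http URL with a host, the path is either empty (sent as "/") or begins with "/", so the target can never be a bare "*".

```python
from urllib.parse import urlsplit


def origin_form_target(url: str) -> str:
    """Build the path-and-query request target a client would send for `url`."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path is sent as "/"
    return path + ("?" + parts.query if parts.query else "")


print(origin_form_target("http://example.com/*"))      # /*
print(origin_form_target("http://example.com"))        # /
print(origin_form_target("http://example.com/a?x=*"))  # /a?x=*
```

The closest a URL gets is "/*", which names a resource whose path happens to contain an asterisk rather than the server itself.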

Disclaimer: I am just reading and interpreting these RFCs myself, so I could be wrong.

For "http - When should an asterisk be encoded in an HTTP URL?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25085992/
