java - URL encoding of the character @ in the query path

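Some places/libraries seem to treat the "@" character in a URL path segment as a "special character" that should be encoded, while others do not.

I am looking for the correct behavior. Example string: "someone@example.com".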

If I go to https://www.urlencoder.org/ and encode the string above, I get someone%40example.com.
If I use org.springframework.web.util.UriUtils, I get the following:

String s1 = UriUtils.encodePathSegment("someone@example.com", "UTF-8");
String s2 = UriUtils.encodeQueryParam("someone@example.com", "UTF-8");
String s3 = UriUtils.encodePath("someone@example.com", "UTF-8");
System.out.println("----------s1: " + s1);
System.out.println("----------s2: " + s2);
System.out.println("----------s3: " + s3);

...which outputs:

----------s1: someone@example.com
----------s2: someone@example.com
----------s3: someone@example.com
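
For comparison, here is a minimal JDK-only sketch of the same disagreement. It assumes Java 10 or later, and the class name, method names, and the example.org host are made up for illustration. URLEncoder implements form-style encoding (application/x-www-form-urlencoded), not path encoding, which is one possible reason a generic encoder escapes "@"; the source does not confirm what urlencoder.org actually does.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Spring or of the original question.
class AtSignEncodingDemo {

    // Form-style encoding: escapes everything except letters, digits
    // and a few marks, so '@' becomes %40.
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Builds a URI from components. The multi-argument constructor only
    // quotes characters that are illegal in the path, and '@' is legal there.
    static String rawPathFor(String segment) {
        try {
            URI uri = new URI("https", "example.org", "/users/" + segment, null);
            return uri.getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String email = "someone@example.com";
        System.out.println(formEncode(email));   // form-encoded value
        System.out.println(rawPathFor(email));   // raw path as built by java.net.URI
    }
}
```

The first line printed should contain %40 and the second should keep the literal "@", mirroring the difference between the online encoder and UriUtils described above.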

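RestEasy-Client v4.0.0.Final does not encode the "@" character in a path segment.
WSO2 ESB fails when it receives a path parameter containing the @ character (more precisely, it cannot find the resource at that point).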

Who is right, and what should the correct result be? Should "@" be converted to "%40"?

Best answer

There are places/libraries that seem to consider "@" characters in a URL Path segment as "special character" that should be encoded, and places/libraries that do not.

The standard that defines which characters must be escaped in a path segment is RFC 3986, Appendix A.

path          = path-abempty    ; begins with "/" or is empty
              / path-absolute   ; begins with "/" but not "//"
              / path-noscheme   ; begins with a non-colon segment
              / path-rootless   ; begins with a segment
              / path-empty      ; zero characters

path-abempty  = *( "/" segment )
path-absolute = "/" [ segment-nz *( "/" segment ) ]
path-noscheme = segment-nz-nc *( "/" segment )
path-rootless = segment-nz *( "/" segment )
path-empty    = 0<pchar>

Note that there are three different flavors of segment, depending on which path production you are using:

segment       = *pchar
segment-nz    = 1*pchar
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
              ; non-zero-length segment without any colon ":"

But...

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

So @ is allowed, unencoded, in any of the path segments.
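
The grammar above can be turned into a small validity check. This is only a sketch: the class and method names are hypothetical, and the sub-delims set ("!", "$", "&", "'", "(", ")", "*", "+", ",", ";", "=") and the pct-encoded rule ("%" followed by two hex digits) are taken from RFC 3986 itself, since they are not quoted in this answer.

```java
// Hypothetical helper that checks a string against segment = *pchar.
class PcharCheck {
    // sub-delims as defined in RFC 3986 (not quoted in the answer above).
    private static final String SUB_DELIMS = "!$&'()*+,;=";
    // Non-alphanumeric members of the unreserved set.
    private static final String UNRESERVED_MARKS = "-._~";

    static boolean isUnreserved(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9') || UNRESERVED_MARKS.indexOf(c) >= 0;
    }

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F')
                || (c >= 'a' && c <= 'f');
    }

    // pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
    static boolean isValidSegment(String s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                // pct-encoded needs exactly two hex digits after '%'.
                if (i + 2 >= s.length()
                        || !isHex(s.charAt(i + 1)) || !isHex(s.charAt(i + 2))) {
                    return false;
                }
                i += 3;
            } else if (isUnreserved(c) || SUB_DELIMS.indexOf(c) >= 0
                    || c == ':' || c == '@') {
                i++;
            } else {
                return false;
            }
        }
        return true; // the empty segment is valid for "segment"
    }

    public static void main(String[] args) {
        System.out.println(isValidSegment("someone@example.com"));
        System.out.println(isValidSegment("someone%40example.com"));
        System.out.println(isValidSegment("some one"));
    }
}
```

Both the literal and the percent-encoded form of the address pass the check, while a segment containing a space or a "/" does not.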

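Is encoding it required? As best I can tell, the answer is no: the pct-encoded representation is permitted, but not mandatory, when @ is not serving the role of a delimiter. Nothing in the RFC says so explicitly, but this observation about unreserved characters is a hint: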

When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters. The only exception is for percent-encoded octets corresponding to characters in the unreserved set, which can be decoded at any time. For example, the octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.

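This indicates that pct-encoding unreserved characters is permitted, although it is clearly not required. The same should therefore hold for other characters once the delimiters have been resolved, which means "@" and "%40" in a path segment should be treated as equivalent.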
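
As a rough illustration of "parse first, decode afterwards", the following sketch uses java.net.URI, which splits a URI into components before getPath() decodes any escaped octets. The class name, method name, and example.org URLs are assumptions made for the example.

```java
import java.net.URI;

// Hypothetical demo class for comparing the two spellings of the same path.
class PctEquivalenceDemo {

    // Parses the URI into components, then returns the decoded path.
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String literal = "https://example.org/users/someone@example.com";
        String encoded = "https://example.org/users/someone%40example.com";

        // Both spellings decode to the same path once the components are separated.
        System.out.println(decodedPath(literal));
        System.out.println(decodedPath(encoded));
        System.out.println(decodedPath(literal).equals(decodedPath(encoded)));
    }
}
```

A server that routes on the decoded path would see the same value for both requests; one that compares raw, undecoded paths could still tell them apart, which may be what happens in the failing case from the question.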

FYI: the unreserved set is pretty much what you would expect.

unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"

The original question and answer on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/56498023/
