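There are places/libraries that seem to consider the "@" character in a URL path segment a "special character" that should be encoded, and places/libraries that do not.
I am looking for the correct behavior. Example string: "someone@example.com".
- If I go to https://www.urlencoder.org/ and encode the string above, I get someone%40example.com
- If I use org.springframework.web.util.UriUtils, I get the following: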
import org.springframework.web.util.UriUtils;

// Encode the same string with each of the three UriUtils flavors.
String s1 = UriUtils.encodePathSegment("someone@example.com", "UTF-8");
String s2 = UriUtils.encodeQueryParam("someone@example.com", "UTF-8");
String s3 = UriUtils.encodePath("someone@example.com", "UTF-8");
System.out.println("----------s1: " + s1);
System.out.println("----------s2: " + s2);
System.out.println("----------s3: " + s3);
...which outputs:
----------s1: someone@example.com
----------s2: someone@example.com
----------s3: someone@example.com
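- RestEasy-Client v4.0.0.Final does not encode the "@" character in a path segment.
- WSO2 ESB fails when it receives a path parameter containing the @ character (more precisely, it cannot find the resource at that point).
Who is right, and what should the correct result be: should "@" be converted to "%40"?
Best answer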
There are places/libraries that seem to consider "@" characters in a URL Path segment as "special character" that should be encoded, and places/libraries that do not.
The standard that defines which characters must be escaped in a path segment is RFC 3986, Appendix A.
path = path-abempty ; begins with "/" or is empty
/ path-absolute ; begins with "/" but not "//"
/ path-noscheme ; begins with a non-colon segment
/ path-rootless ; begins with a segment
/ path-empty ; zero characters
path-abempty = *( "/" segment )
path-absolute = "/" [ segment-nz *( "/" segment ) ]
path-noscheme = segment-nz-nc *( "/" segment )
path-rootless = segment-nz *( "/" segment )
path-empty = 0<pchar>
Note that there are three different flavors of segment, depending on which path production you are using:
segment = *pchar
segment-nz = 1*pchar
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
; non-zero-length segment without any colon ":"
But...
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
So @ can appear literally in any path segment.
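The pchar rule above can be turned into a small encoder. The following is only a minimal sketch, not how Spring or RestEasy implement it: the class name PathSegmentEncoder and its methods are hypothetical, and it assumes the input is raw, not yet encoded data (so a literal "%" is itself encoded as "%25").

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: percent-encode one path segment using the RFC 3986 pchar rule. */
final class PathSegmentEncoder {
    // Punctuation part of: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    private static final String UNRESERVED_PUNCT = "-._~";
    // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    private static final String SUB_DELIMS = "!$&'()*+,;=";

    /** True if the octet may appear literally in a segment (pchar without pct-encoded). */
    static boolean isLiteralPchar(int b) {
        boolean alpha = (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z');
        boolean digit = b >= '0' && b <= '9';
        return alpha || digit
                || UNRESERVED_PUNCT.indexOf(b) >= 0
                || SUB_DELIMS.indexOf(b) >= 0
                || b == ':' || b == '@';
    }

    /** Encodes every UTF-8 octet that is not a literal pchar as %XX. */
    static String encode(String segment) {
        StringBuilder out = new StringBuilder();
        for (byte raw : segment.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xFF;
            if (isLiteralPchar(b)) {
                out.append((char) b);
            } else {
                out.append('%').append(String.format("%02X", b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("someone@example.com")); // someone@example.com
        System.out.println(encode("a b/c"));               // a%20b%2Fc
    }
}
```

Under this reading of the grammar, "@" passes through unchanged, which matches the UriUtils.encodePathSegment output shown in the question, while characters outside pchar such as the space and "/" are escaped.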
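Must it? As best I can tell, the answer is no: when the @ is not serving as a delimiter, using the pct-encoded representation is permitted. There is nothing explicit in the RFC, but this observation about unreserved characters is a hint: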
When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters. The only exception is for percent-encoded octets corresponding to characters in the unreserved set, which can be decoded at any time. For example, the octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.
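This suggests that pct-encoding unreserved characters is allowed, although clearly not necessary. So the same should hold for other characters once the delimiters have been resolved.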
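One way to see that the two spellings carry the same data is to parse both with the JDK's java.net.URI and compare the decoded paths. This is a minimal sketch: the class name AtSignEquivalenceDemo and the http://example.com/users/... URLs are made up for illustration, and java.net.URI follows the older RFC 2396 grammar, which also allows "@" in a path segment.

```java
import java.net.URI;

/** Hypothetical demo: "@" and "%40" in a path segment decode to the same path. */
final class AtSignEquivalenceDemo {
    /** Parses the URI string and returns its path with percent-escapes decoded. */
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    /** Parses the URI string and returns its path exactly as written. */
    static String rawPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    public static void main(String[] args) {
        String literal = "http://example.com/users/someone@example.com";
        String encoded = "http://example.com/users/someone%40example.com";

        System.out.println(rawPath(literal));     // /users/someone@example.com
        System.out.println(rawPath(encoded));     // /users/someone%40example.com
        System.out.println(decodedPath(encoded)); // /users/someone@example.com
        System.out.println(decodedPath(literal).equals(decodedPath(encoded))); // true
    }
}
```

A server that decodes the path before routing should therefore treat both requests alike; one that matches on the raw path may not, which is one possible explanation for a resource not being found.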
FYI: the unreserved set is pretty much what you would expect.
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
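The quoted passage says percent-encoded octets for unreserved characters can be decoded at any time. A minimal sketch of that normalization step follows; the class name UnreservedNormalizer is hypothetical, and the code deliberately leaves every other escape (including "%40") untouched, because decoding those is only safe after the components have been separated.

```java
/** Hypothetical sketch: decode %XX only when it encodes an unreserved character. */
final class UnreservedNormalizer {
    /** unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" */
    static boolean isUnreserved(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    /** Returns the value of an ASCII hex digit, or -1 if c is not one. */
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    /** Replaces escapes of unreserved characters with the character itself. */
    static String normalize(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = hexValue(s.charAt(i + 1));
                int lo = hexValue(s.charAt(i + 2));
                if (hi >= 0 && lo >= 0 && isUnreserved((char) (hi * 16 + lo))) {
                    out.append((char) (hi * 16 + lo));
                    i += 3;
                    continue;
                }
            }
            // Not an escape of an unreserved character: copy it through unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // %7E becomes "~"; %40 is kept because "@" is not in the unreserved set.
        System.out.println(normalize("/%7Euser/someone%40example.com"));
    }
}
```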
Source: a similar question on Stack Overflow about Java URL encoding of the @ character in a path: https://stackoverflow.com/questions/56498023/