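java - URL encoding of the character @ in a query path

Tags: java rest http urlencode

Some places/libraries seem to treat the "@" character in a URL path segment as a "special character" that should be encoded, while others do not.

I am looking for the correct behavior. Example string: "someone@example.com".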

  • If I go to https://www.urlencoder.org/ and encode the string above, I get someone%40example.com
  • If I use org.springframework.web.util.UriUtils, I get the following:

    String s1 = UriUtils.encodePathSegment("someone@example.com", "UTF-8");
    String s2 = UriUtils.encodeQueryParam("someone@example.com", "UTF-8");
    String s3 = UriUtils.encodePath("someone@example.com", "UTF-8");
    System.out.println("----------s1: " + s1);
    System.out.println("----------s2: " + s2);
    System.out.println("----------s3: " + s3);

...which outputs:

----------s1: someone@example.com
----------s2: someone@example.com
----------s3: someone@example.com
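  • RestEasy-Client v4.0.0.Final does not encode the "@" character in a path segment
  • WSO2 ESB chokes when it receives a path parameter containing the @ character (well, it fails to find the resource at that point).

Who is right, and what should the correct result be: should "@" be converted to "%40"?

Best answer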

There are places/libraries that seem to consider "@" characters in a URL Path segment as "special character" that should be encoded, and places/libraries that do not.

The standard that defines which characters must be escaped in a path segment is RFC 3986, Appendix A.

path          = path-abempty    ; begins with "/" or is empty
              / path-absolute   ; begins with "/" but not "//"
              / path-noscheme   ; begins with a non-colon segment
              / path-rootless   ; begins with a segment
              / path-empty      ; zero characters

path-abempty  = *( "/" segment )
path-absolute = "/" [ segment-nz *( "/" segment ) ]
path-noscheme = segment-nz-nc *( "/" segment )
path-rootless = segment-nz *( "/" segment )
path-empty    = 0<pchar>

Note that there are three different flavors of segment, depending on which path production you are using:

segment       = *pchar
segment-nz    = 1*pchar
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
              ; non-zero-length segment without any colon ":"

But...

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

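So @ may appear in any path segment.

Must it appear literally? As best I can tell, the answer is no: the pct-encoded representation is permitted when the @ is not serving as a delimiter. Nothing states this explicitly, but this observation about unreserved characters is a hint: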
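
To make the grammar above concrete, here is a minimal sketch of a path-segment encoder that leaves every literal pchar untouched and percent-encodes everything else as UTF-8 octets. The class and method names (PathSegmentEncoder, isLiteralPchar, encodeSegment) are made up for this illustration and are not taken from Spring, RestEasy, or any other library mentioned in the question. The sketch treats its input as raw data, so a literal "%" is itself encoded.

```java
import java.nio.charset.StandardCharsets;

public class PathSegmentEncoder {
    // Punctuation from the RFC 3986 "unreserved" and "sub-delims" productions.
    private static final String UNRESERVED_PUNCT = "-._~";
    private static final String SUB_DELIMS = "!$&'()*+,;=";

    /** True if c may appear literally in a segment: unreserved / sub-delims / ":" / "@". */
    public static boolean isLiteralPchar(char c) {
        boolean alphaDigit = (c >= 'A' && c <= 'Z')
                || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9');
        return alphaDigit
                || UNRESERVED_PUNCT.indexOf(c) >= 0
                || SUB_DELIMS.indexOf(c) >= 0
                || c == ':' || c == '@';
    }

    /** Percent-encodes every UTF-8 octet that is not a literal pchar. */
    public static String encodeSegment(String segment) {
        StringBuilder out = new StringBuilder();
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF;
            if (octet < 0x80 && isLiteralPchar((char) octet)) {
                out.append((char) octet);
            } else {
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(octet >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(octet & 0xF, 16)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeSegment("someone@example.com")); // "@" is a pchar, left as-is
        System.out.println(encodeSegment("a/b c"));               // "/" and " " are not pchar
    }
}
```

With this reading of the grammar, "someone@example.com" passes through unchanged, which matches the UriUtils output shown in the question.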

When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters. The only exception is for percent-encoded octets corresponding to characters in the unreserved set, which can be decoded at any time. For example, the octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.

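This suggests that pct-encoding of unreserved characters is permitted, although it is clearly not required. So, once the delimiters have been resolved, the same should hold for other characters as well.

FYI: the unreserved set is pretty much what you would expect.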
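
As a rough check of that conclusion, the sketch below uses the JDK's java.net.URI to parse both spellings and compare the decoded path. The host example.com and the /users/ prefix are assumptions made for this illustration. It shows only how one parser behaves, not how every server or gateway will treat the two forms.

```java
import java.net.URI;

public class AtSignEquivalence {
    /** Path component with percent-encoded octets decoded. */
    public static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    /** Path component exactly as it was written in the URI string. */
    public static String rawPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    public static void main(String[] args) {
        String literal = "https://example.com/users/someone@example.com";
        String encoded = "https://example.com/users/someone%40example.com";

        // The raw forms differ, but once the path has been split out and
        // decoded, both spellings yield the same path.
        System.out.println(rawPath(literal));
        System.out.println(rawPath(encoded));
        System.out.println(decodedPath(literal).equals(decodedPath(encoded)));
    }
}
```

A receiver that decodes the path only after splitting it into segments should therefore see the same value for either form.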

unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
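
The "%7E" example from the quoted passage can be sketched as a small normalizer that decodes only those percent-encoded octets that map to unreserved characters and leaves every other triplet alone. The class and method names below are hypothetical. This is one simplified reading of the rule, not a full URI normalizer.

```java
public class UnreservedNormalizer {
    /** unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" */
    public static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    /** Value of an ASCII hex digit, or -1 if ch is not one. */
    private static int hexValue(char ch) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        return -1;
    }

    /** Replaces %XX with its character only when that character is unreserved. */
    public static String decodeUnreserved(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = hexValue(s.charAt(i + 1));
                int lo = hexValue(s.charAt(i + 2));
                if (hi >= 0 && lo >= 0 && isUnreserved(hi * 16 + lo)) {
                    out.append((char) (hi * 16 + lo)); // safe to decode at any time
                    i += 3;
                    continue;
                }
            }
            out.append(c); // everything else is copied through unchanged
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "%7E" becomes "~"; "%40" stays encoded because "@" is not unreserved.
        System.out.println(decodeUnreserved("/%7Ejoe/someone%40example.com"));
    }
}
```

Since "@" is outside the unreserved set, a generic normalizer like this one leaves "%40" as written; whether it is decoded is up to whoever parses the path segment.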

Regarding java - URL encoding of the character @ in a query path, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56498023/
