If I put the URL http://localhost:9000/space test into a web browser's address bar, the browser calls the server with http://localhost:9000/space%20test.
Likewise, http://localhost:9000/specÁÉÍtest is encoded to http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest.
If I put the already-encoded URLs into the address bar (i.e. http://localhost:9000/space%20test and http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest), they stay unchanged (they are not double-encoded).
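Is there a Java API or library that does this kind of encoding? The URLs come from users, so I don't know whether they are already encoded.
(If not, would it be enough to search the input string for % and encode only when none is found, or are there special cases where that fails?)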
Edit:
URLEncoder.encode("space%20test", "UTF-8") returns space%2520test, which is not what I want because it is double-encoded.
Edit 2:
Browsers also handle partially encoded URLs, such as http://localhost:9000/specÁÉ%C3%8Dtest, without double-encoding them. In this case the server receives http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest, which is the same as the encoded form of ...specÁÉÍtest.
Best answer
What every web developer must know about URL encoding
Why is URL encoding needed?
The URL specification RFC 1738 specifies that only a small set of characters
can be used in a URL. Those characters are:
A to Z (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
a to z (abcdefghijklmnopqrstuvwxyz)
0 to 9 (0123456789)
$ (Dollar Sign)
- (Hyphen / Dash)
_ (Underscore)
. (Period)
+ (Plus sign)
! (Exclamation / Bang)
* (Asterisk / Star)
' (Single Quote)
( (Open Bracket)
) (Closing Bracket)
How does URL encoding work?
All offending characters are replaced by a % and a two digit hexadecimal value
that represents the character in the proper ISO character set. Here are a
couple of examples:
$ (Dollar Sign) becomes %24
& (Ampersand) becomes %26
+ (Plus) becomes %2B
, (Comma) becomes %2C
: (Colon) becomes %3A
; (Semi-Colon) becomes %3B
= (Equals) becomes %3D
? (Question Mark) becomes %3F
@ (Commercial A / At) becomes %40
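The replacement rule quoted above can be sketched in Java. This is a minimal illustration, not part of the original answer: it assumes UTF-8 as the character set (which matches the browser output in the question, where Á becomes %C3%81), it treats exactly the characters in the allowed list above as safe (so $ and + are left unescaped), and the class and method names are made up for the example. Because it also escapes / and :, it is meant for a single path segment or value, not a whole URL.

```java
import java.nio.charset.StandardCharsets;

public class PercentEncodeDemo {
    // Punctuation from the allowed list above; letters and digits are checked separately.
    private static final String SAFE_PUNCTUATION = "$-_.+!*'()";

    /** Percent-encodes every byte of the UTF-8 form that is not in the allowed set. */
    public static String percentEncode(String input) {
        StringBuilder out = new StringBuilder();
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int value = b & 0xFF; // treat the byte as unsigned
            boolean asciiAlnum = (value >= 'A' && value <= 'Z')
                    || (value >= 'a' && value <= 'z')
                    || (value >= '0' && value <= '9');
            if (asciiAlnum || SAFE_PUNCTUATION.indexOf(value) >= 0) {
                out.append((char) value);
            } else {
                // "%" followed by the two-digit hexadecimal value of the byte
                out.append('%').append(String.format("%02X", value));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncode("space test"));
        System.out.println(percentEncode("a&b=c"));
        // \u00C1\u00C9\u00CD is ÁÉÍ, written as escapes to avoid source-encoding issues
        System.out.println(percentEncode("spec\u00C1\u00C9\u00CDtest"));
    }
}
```

Note that this sketch, like URLEncoder.encode, would turn an existing %20 into %2520, which is the double-encoding problem the question describes.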
A simple example:
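import java.util.logging.Level;
import java.util.logging.Logger;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class TextHelper {

    // Needs a JavaScript engine at runtime. Nashorn was removed from the JDK
    // in Java 15, so getEngineByName may return null on newer runtimes.
    private static final ScriptEngine engine = new ScriptEngineManager()
            .getEngineByName("JavaScript");

    /**
     * Encoding if need escaping %$&+,/:;=?@<>#%
     *
     * Note: JavaScript's escape() emits %XX for code units below 256 and
     * %uXXXX otherwise; it does not produce UTF-8 percent-encoding.
     *
     * @param str should be encoded
     * @return encoded result, or null if the script fails
     */
    public static synchronized String escapeJavascript(String str) {
        try {
            // Pass the input as a binding instead of splicing it into the
            // script text, so quotes or backslashes in str cannot break it.
            engine.put("input", str.replace("%20", " "));
            // String.replace is literal. replaceAll would treat "$" in the
            // replacement as a group reference and throw.
            return engine.eval("escape(input)").toString()
                    .replace("%3A", ":")
                    .replace("%2F", "/")
                    .replace("%3B", ";")
                    .replace("%40", "@")
                    .replace("%3C", "<")
                    .replace("%3E", ">")
                    .replace("%3D", "=")
                    .replace("%26", "&")
                    .replace("%25", "%")
                    .replace("%24", "$")
                    .replace("%23", "#")
                    .replace("%2B", "+")
                    .replace("%2C", ",")
                    .replace("%3F", "?");
        } catch (ScriptException ex) {
            Logger.getLogger(TextHelper.class.getName())
                    .log(Level.SEVERE, null, ex);
            return null;
        }
    }
}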
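As an alternative that does not depend on a JavaScript engine, the "encode only where necessary" behaviour from the question could be sketched in plain Java as follows. This is an assumption-laden illustration, not a reproduction of what any browser does: the class name LenientUrlEncoder, the method encodeIfNeeded, and the choice of characters to keep are all hypothetical, and real browsers apply different rules to different URL components. The idea is to copy any % that is followed by two hex digits unchanged, keep ASCII letters, digits, and common URL delimiters, and percent-encode everything else as UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class LenientUrlEncoder {
    // Assumed keep-set: unreserved characters plus the delimiters a full URL needs.
    private static final String KEEP = "-._~!$&'()*+,;=:@/?#[]";

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f');
    }

    private static boolean isAsciiAlnum(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
    }

    /** Encodes only what needs encoding; existing %XX escapes are copied as-is. */
    public static String encodeIfNeeded(String url) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < url.length()) {
            char c = url.charAt(i);
            boolean validEscape = c == '%' && i + 2 < url.length()
                    && isHex(url.charAt(i + 1)) && isHex(url.charAt(i + 2));
            if (validEscape) {
                out.append(url, i, i + 3); // already encoded, do not touch
                i += 3;
            } else if (isAsciiAlnum(c) || KEEP.indexOf(c) >= 0) {
                out.append(c);
                i++;
            } else {
                // Encode the whole code point (handles surrogate pairs) as UTF-8 bytes.
                int codePoint = url.codePointAt(i);
                String ch = new String(Character.toChars(codePoint));
                for (byte b : ch.getBytes(StandardCharsets.UTF_8)) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
                i += Character.charCount(codePoint);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeIfNeeded("http://localhost:9000/space test"));
        System.out.println(encodeIfNeeded("http://localhost:9000/space%20test"));
        // \u00C1\u00C9 is ÁÉ; the trailing %C3%8D is already encoded
        System.out.println(encodeIfNeeded("http://localhost:9000/spec\u00C1\u00C9%C3%8Dtest"));
    }
}
```

This also shows why simply searching the input for % is not enough: a partially encoded URL contains % yet still needs encoding, and an input such as "100% sure" contains a % that is not part of an escape. One ambiguity remains that no heuristic can resolve: if the user literally means the text "%41", it is indistinguishable from an escape and will be left as-is.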
Regarding "Java library for URL encoding if necessary (like a browser)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/14357970/