java - How to get Java to match JavaScript's encodeURIComponent() method?

Tags: java javascript url encoding utf-8

I'm trying to pass this string in a URL that contains special characters, and the only way I can get it to work is with JavaScript's encodeURIComponent('testeræøå'), which yields "tester%C3%A6%C3%B8%C3%A5".

Everything I try in Java produces a different encoding and doesn't work on the other end... Any idea how to encode testeræøå as tester%C3%A6%C3%B8%C3%A5 in Java? Thanks in advance!

package com.mastercard.cp.sdng.domain.user;

import org.apache.commons.lang.StringUtils;

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class UrlEncodingSample
{
    public static void main(String[] args)
    {
        String userId = "dummy";
        try
        {
            validateEncoding(userId);

            userId = "testeræøå";

            validateEncoding(userId);

            userId = URLEncoder.encode(userId);

            validateEncoding(userId);
        }
        catch (UnsupportedEncodingException e)
        {
            e.printStackTrace();
        }

    }

    private static void validateEncoding(String userId) throws UnsupportedEncodingException
    {
        System.out.println("------ START TESTING WITH USER ID = '"+userId+"' ----------------------");
        System.out.println("Test URLEncoder.encode(userId): " + URLEncoder.encode(userId));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-8\"): " + URLEncoder.encode(userId, "UTF-8"));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-16\"): " + URLEncoder.encode(userId,"UTF-16"));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-16LE\"): " + URLEncoder.encode(userId,"UTF-16LE"));
        System.out.println("Test URLEncoder.encode(userId,\"UTF-16BE\"): " + URLEncoder.encode(userId,"UTF-16BE"));

        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        try
        {
            System.out.println("Test engine.eval(\"encodeURIComponent(\\\"\"+userId+\"\\\")\"): " +
                    engine.eval("encodeURIComponent(\""+userId+"\")"));
        }
        catch (ScriptException e)
        {
            e.printStackTrace();
        }
        System.out.println("Test encodeURIComponent(userId): " + encodeURIComponent(userId));
        try
        {
            System.out.println("TEST new URI(userId).toASCIIString(): " + new URI(userId).toASCIIString());
        }
        catch (URISyntaxException e)
        {
            e.printStackTrace();
        }
        System.out.println("------ END TESTING WITH USER ID = '"+userId+"' ----------------------\n\n");

    }



    public static String encodeURIComponent(String input) {
        if(StringUtils.isEmpty(input)) {
            return input;
        }

        int l = input.length();
        StringBuilder o = new StringBuilder(l * 3);
        try {
            for (int i = 0; i < l; i++) {
                String e = input.substring(i, i + 1);
                if (ALLOWED_CHARS.indexOf(e) == -1) {
                    byte[] b = e.getBytes("utf-8");
                    o.append(getHex(b));
                    continue;
                }
                o.append(e);
            }
            return o.toString();
        } catch(UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return input;
    }

    private static String getHex(byte buf[]) {
        StringBuilder o = new StringBuilder(buf.length * 3);
        for (int i = 0; i < buf.length; i++) {
            int n = (int) buf[i] & 0xff;
            o.append("%");
            if (n < 0x10) {
                o.append("0");
            }
            o.append(Long.toString(n, 16).toUpperCase());
        }
        return o.toString();
    }

    public static final String ALLOWED_CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.!~*'()";
}

The output of the class above is:


    ------ START TESTING WITH USER ID = 'dummy' ----------------------
    Test URLEncoder.encode(userId): dummy
    Test URLEncoder.encode(userId,"UTF-8"): dummy
    Test URLEncoder.encode(userId,"UTF-16"): dummy
    Test URLEncoder.encode(userId,"UTF-16LE"): dummy
    Test URLEncoder.encode(userId,"UTF-16BE"): dummy
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): dummy
    Test encodeURIComponent(userId): dummy
    TEST new URI(userId).toASCIIString(): dummy
    ------ END TESTING WITH USER ID = 'dummy' ----------------------


    ------ START TESTING WITH USER ID = 'testerๆ๘ๅ' ----------------------
    Test URLEncoder.encode(userId): tester%E6%F8%E5
    Test URLEncoder.encode(userId,"UTF-8"): tester%E0%B9%86%E0%B9%98%E0%B9%85
    Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%0E%46%0E%58%0E%45
    Test URLEncoder.encode(userId,"UTF-16LE"): tester%46%0E%58%0E%45%0E
    Test URLEncoder.encode(userId,"UTF-16BE"): tester%0E%46%0E%58%0E%45
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%e0%b9%86%e0%b9%98%e0%b9%85
    Test encodeURIComponent(userId): tester%E0%B9%86%E0%B9%98%E0%B9%85
    TEST new URI(userId).toASCIIString(): tester%E0%B9%86%E0%B9%98%E0%B9%85
    ------ END TESTING WITH USER ID = 'testerๆ๘ๅ' ----------------------


    ------ START TESTING WITH USER ID = 'tester%E6%F8%E5' ----------------------
    Test URLEncoder.encode(userId): tester%25E6%25F8%25E5
    Test URLEncoder.encode(userId,"UTF-8"): tester%25E6%25F8%25E5
    Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%25E6%FE%FF%00%25F8%FE%FF%00%25E5
    Test URLEncoder.encode(userId,"UTF-16LE"): tester%25%00E6%25%00F8%25%00E5
    Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%25E6%00%25F8%00%25E5
    Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%25E6%25F8%25E5
    Test encodeURIComponent(userId): tester%25E6%25F8%25E5
    TEST new URI(userId).toASCIIString(): tester%E6%F8%E5
    ------ END TESTING WITH USER ID = 'tester%E6%F8%E5' ----------------------

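Note: as I was writing this, it occurred to me that I could use URLEncoder.encode(userId, "UTF-8") as long as I use the right decoder on the other side... but I'm still trying to find a way to encode it so that it matches JavaScript's encodeURIComponent function, which apparently works without my having to decode it on the other end. :)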
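For illustration, the encode/decode round trip mentioned in the note could look like the sketch below. The class and method names (RoundTripSketch, roundTrip) are hypothetical and not part of the original code; the string literal uses Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class RoundTripSketch {

    // Hypothetical helper: encodes with UTF-8, then decodes with UTF-8 again.
    static String roundTrip(String text) throws UnsupportedEncodingException {
        String encoded = URLEncoder.encode(text, "UTF-8");
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // "tester" followed by U+00E6, U+00F8, U+00E5, written as escapes
        String userId = "tester\u00e6\u00f8\u00e5";
        System.out.println(URLEncoder.encode(userId, "UTF-8"));
        // true when both sides agree on the charset
        System.out.println(roundTrip(userId).equals(userId));
    }
}
```

The round trip only holds when the decoder uses the same charset as the encoder; decoding the same text with a different charset would give different characters.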

Best answer

According to the Mozilla Developer docs, encodeURIComponent() uses UTF-8 encoding. When I run it on your string, I get tester%C3%A6%C3%B8%C3%A5, as expected. When I run the following Java code:

System.out.println(URLEncoder.encode("testeræøå", "UTF-8"));

it also prints tester%C3%A6%C3%B8%C3%A5. I also ran your test and got:

------ START TESTING WITH USER ID = 'dummy' ----------------------
Test URLEncoder.encode(userId): dummy
Test URLEncoder.encode(userId,"UTF-8"): dummy
Test URLEncoder.encode(userId,"UTF-16"): dummy
Test URLEncoder.encode(userId,"UTF-16LE"): dummy
Test URLEncoder.encode(userId,"UTF-16BE"): dummy
Test engine.eval("encodeURIComponent(\""+userId+"\")"): dummy
Test encodeURIComponent(userId): dummy
TEST new URI(userId).toASCIIString(): dummy
------ END TESTING WITH USER ID = 'dummy' ----------------------


------ START TESTING WITH USER ID = 'testeræøå' ----------------------
Test URLEncoder.encode(userId): tester%C3%A6%C3%B8%C3%A5
Test URLEncoder.encode(userId,"UTF-8"): tester%C3%A6%C3%B8%C3%A5
Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%E6%00%F8%00%E5
Test URLEncoder.encode(userId,"UTF-16LE"): tester%E6%00%F8%00%E5%00
Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%E6%00%F8%00%E5
Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%C3%A6%C3%B8%C3%A5
Test encodeURIComponent(userId): tester%C3%A6%C3%B8%C3%A5
TEST new URI(userId).toASCIIString(): tester%C3%A6%C3%B8%C3%A5
------ END TESTING WITH USER ID = 'testeræøå' ----------------------


------ START TESTING WITH USER ID = 'tester%C3%A6%C3%B8%C3%A5' ----------------------
Test URLEncoder.encode(userId): tester%25C3%25A6%25C3%25B8%25C3%25A5
Test URLEncoder.encode(userId,"UTF-8"): tester%25C3%25A6%25C3%25B8%25C3%25A5
Test URLEncoder.encode(userId,"UTF-16"): tester%FE%FF%00%25C3%FE%FF%00%25A6%FE%FF%00%25C3%FE%FF%00%25B8%FE%FF%00%25C3%FE%FF%00%25A5
Test URLEncoder.encode(userId,"UTF-16LE"): tester%25%00C3%25%00A6%25%00C3%25%00B8%25%00C3%25%00A5
Test URLEncoder.encode(userId,"UTF-16BE"): tester%00%25C3%00%25A6%00%25C3%00%25B8%00%25C3%00%25A5
Test engine.eval("encodeURIComponent(\""+userId+"\")"): tester%25C3%25A6%25C3%25B8%25C3%25A5
Test encodeURIComponent(userId): tester%25C3%25A6%25C3%25B8%25C3%25A5
TEST new URI(userId).toASCIIString(): tester%C3%A6%C3%B8%C3%A5
------ END TESTING WITH USER ID = 'tester%C3%A6%C3%B8%C3%A5' ----------------------

This is what I would expect.
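For strings like this one, URLEncoder.encode(..., "UTF-8") and encodeURIComponent() agree. They do differ on a few ASCII characters: URLEncoder writes a space as "+" and escapes ! ' ( ) ~, which encodeURIComponent() leaves alone. A minimal sketch of a wrapper that patches those differences is shown below. The class name EncodeUriComponentSketch is hypothetical, and this is an approximation rather than a full reimplementation (for example, it does not reproduce JavaScript's error on lone surrogates).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeUriComponentSketch {

    // Hypothetical helper: approximates JavaScript's encodeURIComponent()
    // by post-processing the output of URLEncoder.
    public static String encodeURIComponent(String input) {
        try {
            return URLEncoder.encode(input, "UTF-8")
                    .replace("+", "%20")  // URLEncoder writes a space as '+'
                    .replace("%21", "!")  // characters JavaScript leaves unescaped
                    .replace("%27", "'")
                    .replace("%28", "(")
                    .replace("%29", ")")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "tester" followed by U+00E6, U+00F8, U+00E5, written as escapes
        System.out.println(encodeURIComponent("tester\u00e6\u00f8\u00e5"));
        System.out.println(encodeURIComponent("a b!~*'()"));
    }
}
```

A literal "+" in the input is escaped to %2B by URLEncoder before the replacements run, so replacing "+" with "%20" only affects spaces.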

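I think you need to check the file encoding of your Java source file. If you are using Eclipse, for some reason it defaults to cp1252. The first thing I do when installing Eclipse is change the default encoding to UTF-8.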
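To illustrate why the source encoding matters, here is a small sketch under stated assumptions. The class name SourceEncodingSketch and the misread helper are hypothetical. It writes the literal with Unicode escapes, which compile the same way whatever the file encoding is, and then simulates a charset mismatch by decoding the UTF-8 bytes as ISO-8859-1. Outside an IDE, passing -encoding UTF-8 to javac is another way to pin down how source files are read.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SourceEncodingSketch {

    // Unicode escapes keep this literal independent of the file encoding.
    static final String USER_ID = "tester\u00e6\u00f8\u00e5";

    // Simulates reading UTF-8 text with the wrong charset:
    // the UTF-8 bytes are decoded as if they were ISO-8859-1.
    static String misread(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Correctly read literal
        System.out.println(URLEncoder.encode(USER_ID, "UTF-8"));
        // Misread literal: every original byte becomes its own character
        System.out.println(URLEncoder.encode(misread(USER_ID), "UTF-8"));
    }
}
```

The first line printed is tester%C3%A6%C3%B8%C3%A5; the second is a longer, different string, because the encoder receives different characters even though the call is identical.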

Source: a similar question on Stack Overflow, "java - How to get Java to match JavaScript's encodeURIComponent() method?": https://stackoverflow.com/questions/25298063/
