我有一个 Java String 对象,其中包含类似 "resumè" 的单词,或者包含任何国际字符的任何单词。我想要做的是将其转换为对 ASCII 字符串中的非 ASCII 字符进行编码,例如 "resum\u00E8"。我如何使用 Java 来做到这一点?
最佳答案
继承 Tagir Valeev 从 java.util.Properties 中获取的想法:
package empty;
public class CharsetEncode {
public static void main(String[] args) {
String s = "resumè";
System.out.println(decompose(s));
}
public static String decompose(String s) {
return saveConvert(s, true, true);
}
private static String saveConvert(String theString, boolean escapeSpace, boolean escapeUnicode) {
int len = theString.length();
int bufLen = len * 2;
if (bufLen < 0) {
bufLen = Integer.MAX_VALUE;
}
StringBuffer outBuffer = new StringBuffer(bufLen);
for (int x = 0; x < len; x++) {
char aChar = theString.charAt(x);
// Handle common case first, selecting largest block that
// avoids the specials below
if ((aChar > 61) && (aChar < 127)) {
if (aChar == '\\') {
outBuffer.append('\\');
outBuffer.append('\\');
continue;
}
outBuffer.append(aChar);
continue;
}
switch (aChar) {
case ' ':
if (x == 0 || escapeSpace)
outBuffer.append('\\');
outBuffer.append(' ');
break;
case '\t':
outBuffer.append('\\');
outBuffer.append('t');
break;
case '\n':
outBuffer.append('\\');
outBuffer.append('n');
break;
case '\r':
outBuffer.append('\\');
outBuffer.append('r');
break;
case '\f':
outBuffer.append('\\');
outBuffer.append('f');
break;
case '=': // Fall through
case ':': // Fall through
case '#': // Fall through
case '!':
outBuffer.append('\\');
outBuffer.append(aChar);
break;
default:
if (((aChar < 0x0020) || (aChar > 0x007e)) & escapeUnicode) {
outBuffer.append('\\');
outBuffer.append('u');
outBuffer.append(toHex((aChar >> 12) & 0xF));
outBuffer.append(toHex((aChar >> 8) & 0xF));
outBuffer.append(toHex((aChar >> 4) & 0xF));
outBuffer.append(toHex(aChar & 0xF));
} else {
outBuffer.append(aChar);
}
}
}
return outBuffer.toString();
}
private static char toHex(int nibble) {
return hexDigit[(nibble & 0xF)];
}
/** A table of hex digits */
private static final char[] hexDigit = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };
}
关于java - 需要使用 Java 将带有 è 的 Java 字符串转换为\u00E8,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31000219/