Hi all,
I have a desktop Java application that prints the following output to the console window:
[
{
"ew" : "ana",
"hws" : [
"\u0623\u0646\u0627"
]
}
]
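I want to separate this string, "\u0623\u0646\u0627", from the rest of the output so that only that string is processed further.
I don't know how to do that. One idea is to use a regex, but how would I go about it?
Can you help me?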
Best answer
Given the additional information:
"The output shall be Arabic letters, not \u064A... etc. My idea was to search the output for the \u064A... lines and convert them to Arabic. Do you get my point? I don't know how to solve this, I am a beginner in Java. Sorry for the confusion and thank you for your response."
and given that the input comes from http://www.google.com/transliterate/arabic?tlqt=1&langpair=en|ar&text=ana,masry&&tl_app=1, you can solve it like this:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class URLConnectionReader {
    public static void main(String[] args) throws Exception {
        URL googleUrl = new URL("http://www.google.com/transliterate/arabic?tlqt=1&langpair=en|ar&text=ana,masry&&tl_app=1");
        URLConnection googleUrlc = googleUrl.openConnection();

        // A double-quoted string made up only of backslash-u escapes (four hex digits each).
        Pattern wordRegex = Pattern.compile("\"(\\\\u[\\da-f]{4})+\"", Pattern.CASE_INSENSITIVE);
        // A single backslash-u escape; group 1 captures its four hex digits.
        Pattern charRegex = Pattern.compile("\\\\u([\\da-f]{4})", Pattern.CASE_INSENSITIVE);

        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(googleUrlc.getInputStream(), StandardCharsets.UTF_8))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                Matcher wordMatch = wordRegex.matcher(inputLine);
                while (wordMatch.find()) {
                    StringBuilder arabicBuffer = new StringBuilder();
                    Matcher charMatch = charRegex.matcher(wordMatch.group());
                    while (charMatch.find()) {
                        // Turn the hex digits into the character they encode.
                        arabicBuffer.appendCodePoint(Integer.parseInt(charMatch.group(1), 16));
                    }
                    if (arabicBuffer.length() > 0) {
                        System.out.println(arabicBuffer);
                    }
                }
            }
        }
    }
}
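If you want to try the matching and decoding without a network call, the same two-pattern idea can be pulled out into a helper that works on any string. The following is only a sketch: the class name `EscapedWordExtractor` and the method `extractWords` are made up for illustration, and the sample string is the output from the question, collapsed onto one line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class EscapedWordExtractor {
    // A double-quoted run of backslash-u escapes, four hex digits each.
    private static final Pattern WORD =
            Pattern.compile("\"((?:\\\\u[0-9a-fA-F]{4})+)\"");
    // A single escape; group 1 captures the four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    /** Returns every escaped word found in the text, decoded to real characters. */
    public static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher word = WORD.matcher(text);
        while (word.find()) {
            StringBuilder decoded = new StringBuilder();
            Matcher escape = ESCAPE.matcher(word.group(1));
            while (escape.find()) {
                // Each escape encodes one UTF-16 code unit.
                decoded.append((char) Integer.parseInt(escape.group(1), 16));
            }
            words.add(decoded.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // Sample output from the question. The doubled backslashes keep the
        // escapes as literal text instead of letting the compiler decode them.
        String sample = "[ { \"ew\" : \"ana\", \"hws\" : [ \"\\u0623\\u0646\\u0627\" ] } ]";
        for (String word : extractWords(sample)) {
            System.out.println(word);
        }
    }
}
```

Plain strings such as "ana" are skipped because the first pattern only accepts quoted text that consists entirely of escapes. Whether the Arabic letters display correctly depends on the console's encoding and font.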
Regarding "java - How to match Arabic Unicode characters in a string with Java?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/5457264/