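I want to use a regular expression in Java to extract a part of a URL that sits in the middle of it.
This is what I tried. The main problem with detecting java+regex is that it sits in the middle of the last part of the URL, and I don't know how to ignore the characters after it; my regex only ignores the ones before it: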
String regex = "https://www\\.google\\.com/(search)?q=([^/]+)/";
String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
Pattern pattern = Pattern.compile (regex);
Matcher matcher = pattern.matcher (url);
if (matcher.matches ())
{
int n = matcher.groupCount ();
for (int i = 0; i <= n; ++i)
System.out.println (matcher.group (i));
}
The result should be regex+java, or even regex java, but my code does not work...
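Best answer
Try escaping the ? (unescaped, it only makes the preceding group optional) and stopping the capture at the first & instead of requiring a trailing /. The final .* lets matches() accept the rest of the URL: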
String regex = "https://www\\.google\\.com/search\\?q=([^&]+).*";
String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
Pattern pattern = Pattern.compile (regex);
Matcher matcher = pattern.matcher (url);
if (matcher.matches ())
{
int n = matcher.groupCount ();
for (int i = 0; i <= n; ++i)
System.out.println (matcher.group (i));
}
The output is:
https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
regex+java
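As a complement to the answer above, here is a minimal sketch that uses find() instead of matches(), so the pattern does not have to describe the whole URL. The class name QueryParamExtractor, the method extractQ, and the shortened sample URL are illustrative assumptions, not part of the original answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the names are not from the original answer.
class QueryParamExtractor {
    // "q=" must directly follow '?' or '&'; the group captures up to the next '&' or '#'.
    private static final Pattern Q_PARAM = Pattern.compile("[?&]q=([^&#]*)");

    static Optional<String> extractQ(String url) {
        Matcher m = Q_PARAM.matcher(url);
        // find() looks for the pattern anywhere in the input,
        // whereas matches() requires the pattern to cover the entire string.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8";
        System.out.println(extractQ(url).orElse("(no q parameter)")); // prints regex+java
    }
}
```

Because the pattern is anchored only to the parameter name, this sketch should also work when q is not the first parameter in the query string.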
Edit
Replace all plus signs before printing:
for (int i = 0; i <= n; ++i) {
String str = matcher.group (i).replaceAll("\\+", " ");
System.out.println (str);
}
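Note that the loop above also rewrites the plus signs in group 0, the full URL. As an alternative sketch, the captured value alone can be passed through java.net.URLDecoder, which turns + into a space and also resolves percent-escapes such as %2B. The class name QueryValueDecoder and its decode helper are assumptions made for illustration, as is the choice of UTF-8 as the encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for illustration; not part of the original answer.
class QueryValueDecoder {
    // Decodes a single query-string value, assuming it was encoded as UTF-8.
    static String decode(String rawValue) {
        try {
            return URLDecoder.decode(rawValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("regex+java"));    // prints regex java
        System.out.println(decode("C%2B%2B+regex")); // prints C++ regex
    }
}
```

With a plain replaceAll, an escaped literal plus (%2B) would stay encoded, so decoding may be the safer choice when the search term can contain special characters.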
Regarding "java - Use regex to extract a specific part of a URL", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9927297/