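I have a large string, and I want to get all substrings of the form [[someword]] from it.
That is, I want a list of all the words enclosed between the opening and closing double square brackets.
One way is to split the string on spaces and then filter the resulting list, but this breaks when [[someword]] does not stand alone as a word: it may have a comma, a space, or a period directly before or after it.
What is the best way to do this?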
I will appreciate a solution in Scala but as this is more of a programming problem, I will convert your solution to Scala if it's in some other language I know e.g. Python.
This question is different from the marked duplicate because the regex needs to be able to accommodate characters other than English characters between the brackets.
Best answer
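You can use this regex to match and extract all the words enclosed in double square brackets:
(?<=\[{2})[^[\]]+(?=\]{2})
The lookbehind requires [[ immediately before the match, the lookahead requires ]] immediately after it, and [^[\]]+ matches one or more characters that are not square brackets, so non-English characters are accepted as well.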
Here is a Python solution,
import re
s = 'some text [[someword]] some [[some other word]]other text '
print(re.findall(r'(?<=\[{2})[^[\]]+(?=\]{2})', s))
Prints,
['someword', 'some other word']
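The question also mentions punctuation directly next to the brackets and non-English text inside them. As a rough check of those cases, the sketch below uses a capturing-group variant of the pattern (a swapped-in alternative to the lookarounds, not the regex given in this answer). The helper name bracketed_words and the sample string are made up for illustration.

```python
import re

# Hypothetical helper, not part of the original answer.
def bracketed_words(text):
    """Return every run of non-bracket characters enclosed in [[ and ]]."""
    # With a single capturing group, re.findall returns only the group
    # contents, so the surrounding brackets are dropped.
    return re.findall(r'\[\[([^\[\]]+)\]\]', text)

# Made-up sample: a comma, a period, and a quote touch the brackets,
# and two of the bracketed items contain non-English characters.
sample = 'See [[someword]], then [[日本語 テキスト]].And "[[café]]"'
print(bracketed_words(sample))
# ['someword', '日本語 テキスト', 'café']
```

For this sample the lookaround pattern from the answer returns the same list; the capturing group is only a matter of preference, and may be easier to port to regex engines with limited lookbehind support.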
I have never worked with Scala, but here is a Java solution. As far as I know, Scala runs on the JVM and can use Java's regex classes directly, so this may help.
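import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The statements below go inside a method, e.g. main.
String s = "some text [[someword]] some [[some other word]]other text ";
Pattern p = Pattern.compile("(?<=\\[{2})[^\\[\\]]+(?=\\]{2})");
Matcher m = p.matcher(s);
// Each find() call advances to the next match; group() is the text between [[ and ]].
while (m.find()) {
    System.out.println(m.group());
}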
Prints,
someword
some other word
Let me know if this is what you were looking for.
Regarding "regex - how to get all substrings of a specific format from a string", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/55454136/