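Please help me with this regular expression. I need all the components of the first Meta Mapping for each phrase.
Phrase:.\nMeta Mapping*.* What should come after this? I only started learning regex today.
So far I am a bit stuck. I have the document below and the output I want from it.
Main document: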
Phrase: "is"
Phrase: "normal."
Meta Mapping (1000):
1000 % Normal (Mean Percent of Normal) [Quantitative Concept]
Meta Mapping (1000):
1000 Normal [Qualitative Concept]
Meta Mapping (1000):
1000 % normal (Percent normal) [Quantitative Concept]
Processing 00000000.tx.8: The EKG shows nonspecific changes.
Phrase: "The EKG"
Meta Mapping (1000):
1000 EKG (Electrocardiogram) [Finding]
Meta Mapping (1000):
1000 EKG (Electrocardiography) [Diagnostic Procedure]
Phrase: "shows"
Meta Mapping (1000):
1000 Show [Intellectual Product]
Phrase: "nonspecific changes."
Meta Mapping (901):
694 Nonspecific [Idea or Concept]
861 changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
694 Nonspecific [Idea or Concept]
861 changes (Changing) [Functional Concept]
Meta Mapping (901):
694 Non-specific (Unspecified) [Qualitative Concept]
861 changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
694 Non-specific (Unspecified) [Qualitative Concept]
861 changes (Changing) [Functional Concept]
I want the result to contain only one Meta Mapping per phrase.
So:
Phrase: "normal."
Meta Mapping (1000):
1000 % Normal (Mean Percent of Normal) [Quantitative Concept]
Phrase: "The EKG"
Meta Mapping (1000):
1000 EKG (Electrocardiogram) [Finding]
Phrase: "shows"
Meta Mapping (1000):
1000 Show [Intellectual Product]
Phrase: "nonspecific changes."
Meta Mapping (901):
694 Nonspecific [Idea or Concept]
861 changes (Changed status) [Quantitative Concept]
Please help me with this regular expression. I need all the components of the first Meta Mapping for each phrase. Thanks!
Best Answer
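I think this regex may work for you. It is regex only, with no awk involved. The ^ in the middle of the pattern only matches at a line start when the multiline flag is on. You can test it at regex101.com/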
Phrase.*\nMeta.*\n^((?!Meta|\n).*\n)*
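To try the idea outside regex101, here is a minimal Python sketch. It writes the negative lookahead as a group alternation, `(?!Meta|Phrase|\n)`, so a line counts as a component only if it does not start with `Meta` or `Phrase` and is not blank. The extra `Phrase` alternative, the function name `first_mappings`, and the shortened sample text are assumptions made for illustration. They are not part of the answer.

```python
import re

# Assumption: component lines are whatever follows the first "Meta Mapping"
# header, up to a line starting with "Meta" or "Phrase", or a blank line.
FIRST_MAPPING = re.compile(
    r"^Phrase.*\nMeta.*\n(?:(?!Meta|Phrase|\n).*\n)*",
    re.MULTILINE,  # lets ^ match at the start of every line
)


def first_mappings(text: str) -> list[str]:
    """Return one block per phrase: the Phrase line, the first
    Meta Mapping header, and that mapping's component lines."""
    return [m.group(0) for m in FIRST_MAPPING.finditer(text)]


# Shortened sample in the same layout as the question's document.
sample = (
    'Phrase: "is"\n'
    'Phrase: "shows"\n'
    'Meta Mapping (1000):\n'
    '1000 Show [Intellectual Product]\n'
    'Phrase: "nonspecific changes."\n'
    'Meta Mapping (901):\n'
    '694 Nonspecific [Idea or Concept]\n'
    '861 changes (Changed status) [Quantitative Concept]\n'
    'Meta Mapping (901):\n'
    '694 Nonspecific [Idea or Concept]\n'
    '861 changes (Changing) [Functional Concept]\n'
)

for block in first_mappings(sample):
    print(block, end="")
```

A phrase with no Meta Mapping, such as "is", produces no match, because the pattern requires a `Meta` line directly after the `Phrase` line.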
GNU awk version:
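gawk '
BEGIN {
    FS  = "\n"      # every line of a record is one field
    RS  = "\n\n"    # records are separated by a blank line (multi-character RS is not POSIX; gawk supports it)
    OFS = "\n"
}
NF > 1 {                          # skip one-line records such as a Phrase with no mapping
    print $1, $2                  # the Phrase line and the first Meta Mapping header
    for (i = 3; i <= NF; i++) {
        if ($i ~ /Meta Mapping/)  # second mapping reached: stop
            break
        print $i                  # component line of the first mapping
    }
    print ""                      # blank line between output blocks
}
' your_data_file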
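The same record-based logic can be sketched in Python for comparison. This assumes, as `RS="\n\n"` implies, that phrase blocks in the real file are separated by blank lines, and that each multi-line record starts with a Phrase line followed by a Meta Mapping header. The function name `first_mapping_per_record` and the sample text are hypothetical.

```python
def first_mapping_per_record(text: str) -> list[str]:
    """Mirror the awk logic: one output block per multi-line record."""
    blocks = []
    for record in text.split("\n\n"):            # like RS = "\n\n"
        fields = record.strip("\n").split("\n")  # like FS = "\n"
        if len(fields) < 2:                      # like NF > 1
            continue
        kept = fields[:2]                        # Phrase line + first header
        for line in fields[2:]:
            if "Meta Mapping" in line:           # second mapping: stop
                break
            kept.append(line)
        blocks.append("\n".join(kept))
    return blocks


# Assumed layout: a blank line between phrase blocks.
sample = (
    'Phrase: "is"\n'
    '\n'
    'Phrase: "The EKG"\n'
    'Meta Mapping (1000):\n'
    '1000 EKG (Electrocardiogram) [Finding]\n'
    'Meta Mapping (1000):\n'
    '1000 EKG (Electrocardiography) [Diagnostic Procedure]\n'
    '\n'
    'Phrase: "shows"\n'
    'Meta Mapping (1000):\n'
    '1000 Show [Intellectual Product]\n'
)

print("\n\n".join(first_mapping_per_record(sample)))
```

Unlike the regex, this approach never looks across a blank line, so it depends entirely on the blank-line layout being present in the input.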
Regarding "regex - selectively extracting lines from a block of lines", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/24027434/