Is it possible to do sequencing in UIMA Ruta? For example:
Input file:
some text
Fig 1.1
Table 1.1
Fig 1.2
some text
Pic 1.2
Table 1.2
some text
Table 1.3
Pic 1.3
some text
Fig 1.4
some text
Table 1.4
some text
Table 1.5
Fig 1.6
Box 1.1
Fig 1.5
How can I find the missing figure (Fig 1.3)?
Best answer
Here is an example of how this can be done with UIMA Ruta 2.5.0.
Input text:
some text
Fig 1.1
some text
Pic 1.2
some text
Pic 1.3
some text
Fig 1.4
some text
Rule script:
DECLARE FigureInd;
DECLARE FigureMention (INT chapter, INT section);
ACTION FM(INT chap, INT sect) = CREATE(FigureMention, "chapter" = chap, "section" = sect);
"Fig"-> FigureInd;
INT c, s;
(FigureInd NUM{PARSE(c)} PERIOD NUM{PARSE(s)}){-> FM(c,s)};
DECLARE FigMissing;
f1:FigureMention #{-> FigMissing} f2:FigureMention
{f1.chapter == f2.chapter, f1.section < (f2.section - 1)};
INT pc, ps;
f:FigureMention{-> pc=f.chapter, ps=f.section}
FigMissing->{
(ANY @NUM{PARSE(c)} PERIOD NUM{PARSE(s)}){c==pc,s==ps+1-> FM(c,s), pc=c, ps=s};
};
Created FigureMention annotations:
Fig 1.1
Pic 1.2
Pic 1.3
Fig 1.4
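
For readers who do not know Ruta, the gap-filling idea behind the script can be approximated in plain Python. This is only a rough sketch and not a translation of the rules: the regular expression, the `TEXT` constant and the function name `find_figure_mentions` are assumptions made for illustration, and it treats any "<word> <number>.<number>" string as a candidate mention.

```python
import re

# Sample input taken from the answer above.
TEXT = """some text
Fig 1.1
some text
Pic 1.2
some text
Pic 1.3
some text
Fig 1.4
some text"""

# Hypothetical pattern: a label word, a space, then "<chapter>.<section>".
LABELLED = re.compile(r"(\w+) (\d+)\.(\d+)")


def find_figure_mentions(text):
    """Return explicit "Fig" mentions plus candidates that fill numbering gaps."""
    # All candidates in document order: (label, chapter, section, matched text).
    candidates = [
        (m.group(1), int(m.group(2)), int(m.group(3)), m.group(0))
        for m in LABELLED.finditer(text)
    ]
    # Step 1: explicit mentions are the candidates labelled "Fig".
    fig_idx = [i for i, c in enumerate(candidates) if c[0] == "Fig"]
    mentions = set(fig_idx)

    # Step 2: look at each pair of consecutive explicit mentions.
    for a, b in zip(fig_idx, fig_idx[1:]):
        _, chap_a, sect_a, _ = candidates[a]
        _, chap_b, sect_b, _ = candidates[b]
        # A gap exists only within one chapter and when a number is skipped.
        if chap_a != chap_b or sect_a >= sect_b - 1:
            continue
        # Step 3: inside the gap, accept candidates that continue the sequence.
        prev_sect = sect_a
        for i in range(a + 1, b):
            _, chap, sect, _ = candidates[i]
            if chap == chap_a and sect == prev_sect + 1:
                mentions.add(i)
                prev_sect = sect

    return [candidates[i][3] for i in sorted(mentions)]


if __name__ == "__main__":
    for mention in find_figure_mentions(TEXT):
        print(mention)
```

On the sample input, the sketch accepts the two "Pic" entries because they continue the 1.1 to 1.4 sequence, which mirrors the annotations listed above.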
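The solution for UIMA Ruta 2.4.0 is very similar, but that version does not allow annotation label expressions to be used directly. The feature values need to be stored in additional variables, and the boolean checks need to be applied after the variables have been set. Here is the solution for UIMA Ruta 2.4.0: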
DECLARE FigureInd;
DECLARE FigureMention (INT chapter, INT section);
ACTION FM(INT chap, INT sect) = CREATE(FigureMention, "chapter" = chap, "section" = sect);
"Fig"-> FigureInd;
INT c, s;
(FigureInd NUM{PARSE(c)} PERIOD NUM{PARSE(s)}){-> FM(c,s)};
DECLARE FigMissing;
INT c1,c2,s1,s2;
(FigureMention<-{FigureMention{-> ASSIGN(c1, FigureMention.chapter), ASSIGN(s1, FigureMention.section)};}
#{-> FigMissing}
FigureMention<-{FigureMention{-> ASSIGN(c2, FigureMention.chapter), ASSIGN(s2, FigureMention.section)};})
{c1 == (c2), s1 < (s2 - 1)};
INT pc, ps;
f:FigureMention{-> pc=FigureMention.chapter, ps=FigureMention.section}
FigMissing->{
(ANY @NUM{PARSE(c)} PERIOD NUM{PARSE(s)}){c==(pc),s==(ps+1)-> FM(c,s), pc=c, ps=s};
};
(Disclaimer: I am a developer of UIMA Ruta)
Source: the original question on Stack Overflow, https://stackoverflow.com/questions/37811878/