java - UIMA RUTA 中的数组 IndexOutOfBound 异常

标签 java exception uima ruta

我正在尝试从 java 上下文运行 UIMA RUTA 脚本,但出现异常。

java.lang.ArrayIndexOutOfBoundsException: -1
    at org.apache.uima.ruta.parser.RutaParser.emitErrorMessage(RutaParser.java:306)
    at org.apache.uima.ruta.parser.RutaParser.reportError(RutaParser.java:327)
    at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:613)
    at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
    at org.apache.uima.ruta.parser.RutaParser.file_input(RutaParser.java:566)
    at org.apache.uima.ruta.engine.RutaEngine.loadScriptIS(RutaEngine.java:939)

在深入研究时没有其他消息,我看到了这个异常。

MissingTokenException(inserted [@-1,0:0='<missing EOF>',<-1>,15:0] at CALL)

看起来子脚本抛出异常,但相同的脚本在 UIMA 工作台中给出了正确的输出。

我在这里缺少什么?

引擎

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
    <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
    <primitive>true</primitive>
    <annotatorImplementationName>org.apache.uima.ruta.engine.RutaEngine</annotatorImplementationName>
    <analysisEngineMetaData>
        <name>org.test.MainEngine</name>
        <description/>
        <version>1.0</version>
        <vendor/>
        <configurationParameters searchStrategy="language_fallback">
            <configurationParameter>
                <name>seeders</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>debug</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalScripts</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>profile</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>debugWithMatches</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>statistics</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalEngines</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalExtensions</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>debugOnlyFor</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>scriptEncoding</name>
                <type>String</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalEngineLoaders</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>resourcePaths</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>defaultFilteredTypes</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>mainScript</name>
                <type>String</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>scriptPaths</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>descriptorPaths</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>removeBasics</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>dynamicAnchoring</name>
                <description>Activates dynamic anchoring (possible speed up).</description>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>greedyRuleElement</name>
                <description>Activates greedy anchoring for rule elements.</description>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>greedyRule</name>
                <description>Activates greedy anchoring for complete rules.</description>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>lowMemoryProfile</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>createdBy</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>simpleGreedyForComposed</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>additionalUimafitEngines</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>strictImports</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>varNames</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>varValues</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>rules</name>
                <type>String</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>dictRemoveWS</name>
                <type>Boolean</type>
                <multiValued>false</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
            <configurationParameter>
                <name>reindexOnly</name>
                <type>String</type>
                <multiValued>true</multiValued>
                <mandatory>false</mandatory>
            </configurationParameter>
        </configurationParameters>
        <configurationParameterSettings>
            <nameValuePair>
                <name>debug</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>profile</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>debugWithMatches</name>
                <value>
                    <boolean>true</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>defaultFilteredTypes</name>
                <value>
                    <array>
                        <string>org.apache.uima.ruta.type.SPACE</string>
                        <string>org.apache.uima.ruta.type.BREAK</string>
                        <string>org.apache.uima.ruta.type.MARKUP</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>removeBasics</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>seeders</name>
                <value>
                    <array>
                        <string>org.apache.uima.ruta.seed.DefaultSeeder</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>createdBy</name>
                <value>
                    <boolean>false</boolean>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>mainScript</name>
                <value>
                    <string>org.test.Main</string>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>scriptPaths</name>
                <value>
                    <array>
                        <string>../../scripts</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>descriptorPaths</name>
                <value>
                    <array>
                        <string>../descriptor</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>resourcePaths</name>
                <value>
                    <array>
                        <string>/Users/Gaurav/Documents/workspace/Paragraph/resources</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalScripts</name>
                <value>
                    <array>
                     <string>org.test.Email</string>
                        <!--<string>org.test.number.Number</string>
                        <string>org.test.Date</string>-->
                        <!--<string>org.test.USAAddress</string>
                        <string>org.test.Name</string>
                        <string>org.test.number.PhoneNumber</string>-->
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalEngines</name>
                <value>
                    <array/>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalUimafitEngines</name>
                <value>
                    <array/>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalExtensions</name>
                <value>
                    <array>
                        <string>org.apache.uima.ruta.string.bool.BooleanOperationsExtension</string>
                        <string>org.apache.uima.ruta.string.StringOperationsExtension</string>
                        <string>org.apache.uima.ruta.block.OnlyFirstBlockExtension</string>
                        <string>org.apache.uima.ruta.block.OnlyOnceBlockExtension</string>
                        <string>org.apache.uima.ruta.block.fst.FSTBlockExtension</string>
                    </array>
                </value>
            </nameValuePair>
            <nameValuePair>
                <name>additionalEngineLoaders</name>
                <value>
                    <array/>
                </value>
            </nameValuePair>
        </configurationParameterSettings>
        <typeSystemDescription>
            <imports>
                <import location="MainTypeSystem.xml"/>
            </imports>
        </typeSystemDescription>
        <typePriorities>
            <priorityList>
                <type>org.apache.uima.ruta.type.RutaFrame</type>
                <type>uima.tcas.Annotation</type>
                <type>org.apache.uima.ruta.type.RutaBasic</type>
            </priorityList>
        </typePriorities>
        <fsIndexCollection/>
        <capabilities>
            <capability>
                <inputs/>
                <outputs/>
                <languagesSupported/>
            </capability>
            <capability>
                <inputs>
                    <type>org.test.Main.Filters</type>
                </inputs>
                <outputs>
                    <type>org.test.Main.Filters</type>
                </outputs>
                <languagesSupported/>
            </capability>
        </capabilities>
        <operationalProperties>
            <modifiesCas>true</modifiesCas>
            <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
            <outputsNewCASes>true</outputsNewCASes>
        </operationalProperties>
    </analysisEngineMetaData>
    <resourceManagerConfiguration/>
</analysisEngineDescription>

脚本

PACKAGE org.test;

SCRIPT org.test.Email;

Document{->LOG("starting Processed")};
WORDLIST FiltersList = 'test/dictionaries/ValueFilters.txt';
DECLARE Filters;
DocumentAnnotation{-> MARKFAST(Filters, FiltersList)};


CALL(Email);

Document{-> ADDRETAINTYPE(MARKUP)};

最佳答案

问题已修复,Ruta 子脚本调用存在问题。

我们必须使用以下语法来调用下标

DocumentAnnotation{->CALL(Email)};

而不是

CALL(Email); 

关于java - UIMA RUTA 中的数组 IndexOutOfBound 异常,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43186949/

相关文章:

c# - 我们如何以及在哪里编写 try catch block 来处理异常

exception - Windows Update 1703 后 UWP 后台传输异常 0x80072EE4

c++ - 为什么重新抛出异常会丢弃 'what()' 给出的信息?

基于注释的JavaDoc

xml - 使用 UIMA 从 XML 文件中提取文本

java - 使用 userId 创建对象列表

java - 如果有必要的数据库

java - Java中是否有一个类保留重复项但不保留数据顺序?

java - 组合框 : java. sql.SQLException 中的错误:游标状态无效 - 没有当前行

apache - UIMA ruta 中的模糊性