java - Extracting the main semantic elements from a Stanford CoreNLP dependency representation

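I am a computer science student working on an NLP project. I have written a program that converts a given input sentence into its dependency representation using the following code: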

    private void nextActionPerformed(java.awt.event.ActionEvent evt) {
        // Build a pipeline that runs up to the dependency parser.
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma, parse");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props, false);
        String text = input.getText();
        Annotation document = pipeline.process(text);
        // Print the collapsed dependency graph of each sentence.
        for (CoreMap sentence : document.get(SentencesAnnotation.class)) {
            SemanticGraph dependencies = sentence.get(CollapsedDependenciesAnnotation.class);
            System.out.println(dependencies);
        }
    }

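For the example sentence "A cat is sitting on the table", I get the following output:

    -> sitting/VBG (root)
      -> cat/NN (nsubj)
        -> A/DT (det)
      -> is/VBZ (aux)
      -> table/NN (nmod:on)
        -> on/IN (case)
        -> the/DT (det)

What I want now is to retrieve the main semantic elements from this dependency representation. For example, from the sentence above I want to retrieve sitting, cat and table. For a general simple sentence like this one, I want to retrieve the root, the subject and the object. Could anyone please help with some sample code?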

Best answer

For simple cases, you can define Semgrex patterns over the dependency graph. For example, to extract subject/verb/object triples, you could use the following code:

SemgrexPattern pattern = SemgrexPattern.compile("{$}=root >/.subj(pass)?/ {}=subject >/.obj/ {}=object");
SemgrexMatcher matcher = pattern.matcher(new Sentence("A cat is sitting on the table").dependencyGraph());
while (matcher.find()) {
  IndexedWord root = matcher.getNode("root");
  IndexedWord subject = matcher.getNode("subject");
  IndexedWord object = matcher.getNode("object");
  System.err.println(root.word() + "(" + subject.word() + ", " + object.word() + ")");
}
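
To make the selection logic explicit without depending on CoreNLP, here is a minimal sketch in plain Java. The `Edge` class and the `firstDependent`, `extract` and `exampleEdges` helpers are hypothetical stand-ins, not CoreNLP API; the edge list is copied by hand from the output in the question. The sketch picks the root's subject and object dependents and, as an assumption that goes beyond the pattern above, falls back to an `nmod:*` dependent when the root has no object edge.

```java
import java.util.Arrays;
import java.util.List;

class DependencyTriples {

    /** Hypothetical stand-in for one dependency edge: governor -relation-> dependent. */
    static final class Edge {
        final String governor;
        final String relation;
        final String dependent;

        Edge(String governor, String relation, String dependent) {
            this.governor = governor;
            this.relation = relation;
            this.dependent = dependent;
        }
    }

    /** Returns the first dependent of head whose relation fully matches relationRegex, or null. */
    static String firstDependent(List<Edge> edges, String head, String relationRegex) {
        for (Edge e : edges) {
            if (e.governor.equals(head) && e.relation.matches(relationRegex)) {
                return e.dependent;
            }
        }
        return null;
    }

    /** Returns [root, subject, object]; an entry is null when no matching edge exists. */
    static List<String> extract(List<Edge> edges) {
        String root = firstDependent(edges, "ROOT", "root");
        String subject = firstDependent(edges, root, ".subj(pass)?");
        String object = firstDependent(edges, root, ".obj");
        if (object == null) {
            // Assumption: treat a prepositional modifier such as nmod:on as the object.
            object = firstDependent(edges, root, "nmod(:.*)?");
        }
        return Arrays.asList(root, subject, object);
    }

    /** Edges copied from the dependency output shown in the question. */
    static List<Edge> exampleEdges() {
        return Arrays.asList(
            new Edge("ROOT", "root", "sitting"),
            new Edge("sitting", "nsubj", "cat"),
            new Edge("cat", "det", "A"),
            new Edge("sitting", "aux", "is"),
            new Edge("sitting", "nmod:on", "table"),
            new Edge("table", "case", "on"),
            new Edge("table", "det", "the"));
    }

    public static void main(String[] args) {
        System.out.println(extract(exampleEdges())); // prints [sitting, cat, table]
    }
}
```

Words are identified by their surface string here to keep the sketch short; a sentence that repeats a word would need token indices instead.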

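Note that even your example does not fall into this simple case: there is an nmod:on edge between sitting and table, not a dobj edge. As things get more complex, it may be worth looking at the Stanford OpenIE output: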

new Sentence("A cat is sitting on the table").openieTriples()
  .forEach(System.err::println);

This will give you the triple (cat; is sitting on; table), which is perhaps easier to post-process into (cat; sitting; table), or into whatever your actual downstream application needs.
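
One possible way to do that post-processing is sketched below in plain Java. It is only an illustration: the `TripleCleaner` class, the `headOfRelation` helper and both word lists are hypothetical and deliberately incomplete, and the relation phrase is passed in as a plain string rather than read from an OpenIE result object. In a real pipeline, part-of-speech tags or lemmas would likely be a more reliable signal than fixed word lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TripleCleaner {
    // Illustrative, incomplete word lists (assumptions, not taken from CoreNLP).
    static final Set<String> AUXILIARIES = new HashSet<>(Arrays.asList(
        "is", "are", "was", "were", "be", "been", "being", "has", "have", "had"));
    static final Set<String> PREPOSITIONS = new HashSet<>(Arrays.asList(
        "on", "in", "at", "to", "of", "by", "with", "from", "under", "over"));

    /** Drops auxiliaries and prepositions from a relation phrase such as "is sitting on". */
    static String headOfRelation(String relation) {
        List<String> kept = new ArrayList<>();
        for (String token : relation.trim().split("\\s+")) {
            String lower = token.toLowerCase();
            if (!AUXILIARIES.contains(lower) && !PREPOSITIONS.contains(lower)) {
                kept.add(token);
            }
        }
        // Fall back to the original phrase if filtering removed every token (e.g. "is in").
        return kept.isEmpty() ? relation.trim() : String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println("(cat; " + headOfRelation("is sitting on") + "; table)");
        // prints (cat; sitting; table)
    }
}
```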

Regarding "java - Extracting the main semantic elements from a Stanford CoreNLP dependency representation", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36003868/
