java - Extracting the head noun

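How can I extract the head noun? I tried a constituency parser, which did not work, so I think I need a dependency parser instead. I ran the demo code below, but it gives me a wrong answer.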

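import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;

public class dependencydemo {
  public static void main(String[] args) throws IOException {
    // Write to the file named by the second argument, or to stdout.
    PrintWriter out;
    if (args.length > 1) {
      out = new PrintWriter(args[1]);
    } else {
      out = new PrintWriter(System.out);
    }

    StanfordCoreNLP pipeline = new StanfordCoreNLP();

    // Read the text from the file named by the first argument, or use a sample.
    Annotation annotation;
    if (args.length > 0) {
      annotation = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
    } else {
      annotation = new Annotation("Yesterday, I went to the Dallas Country Club to play 25 cent Bingo.  While I was there I talked to my friend Jim and we both agree that those people in Washington are destroying our economy.");
    }

    pipeline.annotate(annotation);
    pipeline.prettyPrint(annotation, out);

    // Print the constituency tree of the first sentence.
    List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
    if (sentences != null && sentences.size() > 0) {
      CoreMap sentence = sentences.get(0);
      Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
      // out.println();
      // out.println("The first sentence parsed is:");
      tree.pennPrint(out);
    }
    out.flush();
  }
}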

Output:

(ROOT
  (S
    (NP-TMP (NN Yesterday))
    (, ,)
    (NP (PRP I))
    (VP (VBD went)
      (PP (TO to)
        (NP (DT the) (NNP Dallas) (NNP Country) (NNP Club)))
      (S
        (VP (TO to)
          (VP (VB play)
            (S
              (NP (CD 25) (NN cent))
              (NP (NNP Bingo)))))))
    (. .)))

Dependencies:

root(ROOT-0, went-4)
tmod(went-4, Yesterday-1)
nsubj(went-4, I-3)
det(Club-9, the-6)
nn(Club-9, Dallas-7)
nn(Club-9, Country-8)
prep_to(went-4, Club-9)
aux(play-11, to-10)
xcomp(went-4, play-11)
num(cent-13, 25-12)
nsubj(Bingo-14, cent-13)
xcomp(play-11, Bingo-14)

How can I extract the head noun from this? Apart from that, the output does not look correct.

Best answer

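From your explanation in the comments, my impression is that you want the head constituent of every noun phrase. This is easy to do with CoreNLP.

First, find all the noun phrases. You can do this with a simple Tregex pattern (see Chris Manning's relevant answer).
Then use a CoreNLP "head finder" to select the syntactic head constituent of each matched noun phrase. See, for example, ModCollinsHeadFinder.

Demo code follows.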

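import edu.stanford.nlp.trees.HeadFinder;
import edu.stanford.nlp.trees.PennTreebankLanguagePack;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;

// Fetch a head finder.
HeadFinder hf = new PennTreebankLanguagePack().headFinder();

Tree myTree = ...
// Match every node labeled NP.
TregexPattern tPattern = TregexPattern.compile("NP");
TregexMatcher tMatcher = tPattern.matcher(myTree);
while (tMatcher.find()) {
  Tree nounPhrase = tMatcher.getMatch();

  // determineHead returns the head child (a subtree) of the noun phrase.
  Tree headConstituent = hf.determineHead(nounPhrase);
  System.out.println(headConstituent);
}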
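
To show what "pick the head of each noun phrase" amounts to, here is a minimal sketch in plain Java with no CoreNLP dependency. It does not use ModCollinsHeadFinder or reproduce its rules. It swaps in a deliberately simplified heuristic (rightmost noun-tagged child, else rightmost nested NP, else last child), and the names SimpleHeadNouns, Node, headChild and headWords are hypothetical helpers invented for this illustration. Unlike the Tregex pattern above, it also treats function-tagged labels such as NP-TMP as noun phrases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a toy head finder, NOT CoreNLP's rules.
public class SimpleHeadNouns {

  /** A minimal constituency tree node: a label plus children (words have none). */
  static final class Node {
    final String label;
    final List<Node> children = new ArrayList<>();
    Node(String label) { this.label = label; }
    boolean isLeaf() { return children.isEmpty(); }
  }

  private final String[] tokens;
  private int pos = 0;

  private SimpleHeadNouns(String bracketed) {
    // Pad parentheses with spaces so a whitespace split yields clean tokens.
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
  }

  /** Parses a Penn-style bracketing such as "(NP (DT the) (NN club))". */
  static Node parse(String bracketed) {
    return new SimpleHeadNouns(bracketed).readNode();
  }

  private Node readNode() {
    if (!tokens[pos].equals("(")) {
      return new Node(tokens[pos++]);       // a bare word (leaf)
    }
    pos++;                                   // consume "("
    Node node = new Node(tokens[pos++]);     // constituent or POS label
    while (!tokens[pos].equals(")")) {
      node.children.add(readNode());
    }
    pos++;                                   // consume ")"
    return node;
  }

  /** Assumption: NP and function-tagged variants (NP-TMP, ...) count as noun phrases. */
  static boolean isNounPhrase(Node n) {
    return !n.isLeaf() && (n.label.equals("NP") || n.label.startsWith("NP-"));
  }

  /** Simplified heuristic for the head child of a noun phrase. */
  static Node headChild(Node np) {
    List<Node> kids = np.children;
    for (int i = kids.size() - 1; i >= 0; i--) {
      if (kids.get(i).label.startsWith("NN")) return kids.get(i);  // rightmost NN, NNS, NNP, NNPS
    }
    for (int i = kids.size() - 1; i >= 0; i--) {
      if (isNounPhrase(kids.get(i))) return kids.get(i);           // else rightmost nested NP
    }
    return kids.get(kids.size() - 1);                              // else the last child
  }

  /** Follows head children down from a node until a word is reached. */
  static String headWord(Node node) {
    while (!node.isLeaf()) {
      node = isNounPhrase(node) ? headChild(node)
                                : node.children.get(node.children.size() - 1);
    }
    return node.label;
  }

  private static void collect(Node node, List<String> out) {
    if (isNounPhrase(node)) out.add(headWord(node));
    for (Node child : node.children) collect(child, out);
  }

  /** Returns the head word of every noun phrase, in pre-order. */
  static List<String> headWords(String bracketed) {
    List<String> out = new ArrayList<>();
    collect(parse(bracketed), out);
    return out;
  }

  public static void main(String[] args) {
    // The parse tree printed in the question, on one line.
    String tree = "(ROOT (S (NP-TMP (NN Yesterday)) (, ,) (NP (PRP I)) "
        + "(VP (VBD went) (PP (TO to) (NP (DT the) (NNP Dallas) (NNP Country) (NNP Club))) "
        + "(S (VP (TO to) (VP (VB play) (S (NP (CD 25) (NN cent)) (NP (NNP Bingo))))))) "
        + "(. .)))";
    System.out.println(headWords(tree));
  }
}
```

Run on the tree from the question, this prints [Yesterday, I, Club, cent, Bingo]. A real head finder handles many more cases (possessives, coordination, adjectival heads), so for actual use prefer the CoreNLP head finder shown above.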

This question, "java - Extracting the head noun", comes from Stack Overflow: https://stackoverflow.com/questions/29265488/
