java - Get all nouns related to a verb in WordNet using JWNL

Tags: java nlp wordnet

I am using JWNL (1.4.1 rc2). Given a verb, I need to find the "related" nouns. For example, given the verb bear, I want the noun birth.

I can see this in the WordNet online interface: http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=bear&i=8&h=000100000000000000000#c . How can this be done in JWNL?

Best answer

You can get the Synset for each sense of the word and then print the words in each Synset, like this:

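// Assumes JWNL is already initialized, e.g. JWNL.initialize(new FileInputStream("file_properties.xml"));
// the properties file name is only an example.
// proc is assumed to be the dictionary's MorphologicalProcessor.
MorphologicalProcessor proc = Dictionary.getInstance().getMorphologicalProcessor();
// Look up the verb "bear", reducing inflected forms to the base form.
IndexWord indexWord = proc.lookupBaseForm(POS.VERB, "bear");
int senseNum = 0;
// Each sense of the index word is one Synset.
for (Synset synset : indexWord.getSenses()) {
    senseNum++;
    System.out.println("For sense: " + senseNum + " (" + synset.getGloss() + ")");
    // Print every word in this synset together with its part of speech.
    Word[] words = synset.getWords();
    for (Word word : words) {
        System.out.println("\t" + word.getLemma() + "(" + word.getPOS() + ")");
    }
}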

This prints:

For sense: 1 (have; "bear a resemblance"; "bear a signature")
    bear([POS: verb])
For sense: 2 (cause to be born; "My wife had twins yesterday!")
    give_birth([POS: verb])
    deliver([POS: verb])
    bear([POS: verb])
    birth([POS: verb])
    have([POS: verb])
For sense: 3 (put up with something or somebody unpleasant; "I cannot bear his constant criticism"; "The new secretary had to endure a lot of unprofessional remarks"; "he learned to tolerate the heat"; "She stuck out two years in a miserable marriage")
    digest([POS: verb])
    endure([POS: verb])
    stick_out([POS: verb])
    stomach([POS: verb])
    bear([POS: verb])
    stand([POS: verb])
    tolerate([POS: verb])
    support([POS: verb])
    brook([POS: verb])
    abide([POS: verb])
    suffer([POS: verb])
    put_up([POS: verb])
For sense: 4 (move while holding up or supporting; "Bear gifts"; "bear a heavy load"; "bear news"; "bearing orders")
    bear([POS: verb])
For sense: 5 (bring forth, "The apple tree bore delicious apples this year"; "The unidentified plant bore gorgeous flowers")
    bear([POS: verb])
    turn_out([POS: verb])
For sense: 6 (take on as one's own the expenses or debts of another person; "I'll accept the charges"; "She agreed to bear the responsibility")
    bear([POS: verb])
    take_over([POS: verb])
    accept([POS: verb])
    assume([POS: verb])
For sense: 7 (contain or hold; have within; "The jar carries wine"; "The canteen holds fresh water"; "This can contains water")
    hold([POS: verb])
    bear([POS: verb])
    carry([POS: verb])
    contain([POS: verb])
For sense: 8 (bring in; "interest-bearing accounts"; "How much does this savings certificate pay annually?")
    yield([POS: verb])
    pay([POS: verb])
    bear([POS: verb])
For sense: 9 (have on one's person; "He wore a red ribbon"; "bear a scar")
    wear([POS: verb])
    bear([POS: verb])
For sense: 10 (behave in a certain manner; "She carried herself well"; "he bore himself with dignity"; "They conducted themselves well during these difficult times")
    behave([POS: verb])
    acquit([POS: verb])
    bear([POS: verb])
    deport([POS: verb])
    conduct([POS: verb])
    comport([POS: verb])
    carry([POS: verb])
For sense: 11 (have rightfully; of rights, titles, and offices; "She bears the title of Duchess"; "He held the governorship for almost a decade")
    bear([POS: verb])
    hold([POS: verb])
For sense: 12 (support or hold in a certain manner; "She holds her head high"; "He carried himself upright")
    hold([POS: verb])
    carry([POS: verb])
    bear([POS: verb])
For sense: 13 (be pregnant with; "She is bearing his child"; "The are expecting another child in January"; "I am carrying his child")
    have_a_bun_in_the_oven([POS: verb])
    bear([POS: verb])
    carry([POS: verb])
    gestate([POS: verb])
    expect([POS: verb])

But these are all verbs, since we looked up a verb. To get the nouns (as shown in the web version), some further steps are needed.

These are called "morphosemantically" related words. They are defined in a separate file of morphosemantic links, as stated on the WordNet website. You can write your own code to extract the morphosemantically related words using the mapping in that file.

Since this file is an addition to the standard WordNet distribution, I believe it is unfortunately not supported by JWNL, so it is probably best to write some simple code that loads the mapping yourself. First, convert the xls file to a CSV file with any spreadsheet program, such as Excel. Then you need the sense key of the sense you are interested in. Unfortunately, JWNL (1.4.1 rc2) has no simple method for getting the sense key. JWNL (1.4 rc3) does have one: the method getSenseKey(lemma) in the class Synset. So, assuming you move to JWNL 1.4_rc3, you can do:

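// loadMorphosemanticFile() is your own helper that reads the converted CSV; it is not part of JWNL.
Map<String, List<String>> relatedWords = loadMorphosemanticFile();
...
// Look up the related sense keys by the sense key of this word in its synset.
List<String> related = relatedWords.get(word.getSynset().getSenseKey(word.getLemma()));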
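
The answer does not show loadMorphosemanticFile(). The following is only a minimal sketch of what such a helper might look like. It assumes that the converted CSV keeps the first sense key in column 1 and the related sense key in column 4, with the glosses in later columns; check this against your own export before relying on it. The class name MorphosemanticLinks and its methods are hypothetical, and the sketch returns the interface types Map and List.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JWNL or WordNet.
final class MorphosemanticLinks {

    // Reads the CSV file exported from the spreadsheet and builds the mapping.
    static Map<String, List<String>> load(Path csvFile) throws IOException {
        return parse(Files.readAllLines(csvFile, StandardCharsets.UTF_8));
    }

    // Maps each sense key in column 1 to all sense keys found in column 4.
    // ASSUMPTION: column order is sense key, offset, relation, sense key, offset, glosses.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> related = new HashMap<>();
        for (String line : lines) {
            // Only the first four columns are needed. Sense keys, offsets and
            // relation names contain no commas, so a limited split is enough;
            // the gloss columns (which may contain commas) stay in the remainder.
            String[] cols = line.split(",", 5);
            // Skip the header row, blank lines and anything without a sense key.
            if (cols.length < 4 || !cols[0].contains("%")) {
                continue;
            }
            String fromKey = cols[0].trim();
            String toKey = cols[3].trim();
            related.computeIfAbsent(fromKey, k -> new ArrayList<>()).add(toKey);
        }
        return related;
    }
}
```

A general CSV parser would be more robust if your export quotes the key columns or spreads a gloss over several lines; the limited split above is chosen only to keep the sketch free of dependencies.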

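When the word is birth in sense 2 of bear (sense key bear%2:29:01::, cause to be born; "My wife had twins yesterday!"), this returns a list containing birth%1:28:00::, birth%1:22:00:: and birth%1:11:00::. The sense key of that word is birth%2:29:00::, as shown in the following output produced with JWNL 1.4 rc3: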

For sense: bear%2:29:01:: (cause to be born; "My wife had twins yesterday!")
    give_birth (give_birth%2:29:00::)
    deliver (deliver%2:29:01::)
    bear (bear%2:29:01::)
    birth (birth%2:29:00::)
    have (have%2:29:00::)

I found this through GrepCode, which is a very good resource.
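
The lookup returns sense keys, not plain words, so one more step is needed to arrive at the noun birth. In a WordNet sense key the text before the % is the lemma, and the first number after it is the synset type (1 for nouns, 2 for verbs, which matches bear%2 and birth%1 above). A minimal sketch of that last step follows; the class name SenseKeys and its methods are hypothetical and not part of JWNL.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for working with WordNet sense key strings.
final class SenseKeys {

    // Synset type used for nouns in a sense key (the number right after '%').
    static final int NOUN = 1;

    // Returns the lemma, i.e. everything before the '%' separator.
    static String lemma(String senseKey) {
        int pct = senseKey.lastIndexOf('%');
        if (pct < 0) {
            throw new IllegalArgumentException("Not a sense key: " + senseKey);
        }
        return senseKey.substring(0, pct);
    }

    // Returns the synset type, the number between '%' and the first ':' after it.
    static int synsetType(String senseKey) {
        int pct = senseKey.lastIndexOf('%');
        int colon = senseKey.indexOf(':', pct + 1);
        if (pct < 0 || colon < 0) {
            throw new IllegalArgumentException("Not a sense key: " + senseKey);
        }
        return Integer.parseInt(senseKey.substring(pct + 1, colon));
    }

    // Keeps only noun sense keys and returns their distinct lemmas in input order.
    static List<String> nounLemmas(List<String> senseKeys) {
        Set<String> lemmas = new LinkedHashSet<>();
        for (String key : senseKeys) {
            if (synsetType(key) == NOUN) {
                lemmas.add(lemma(key));
            }
        }
        return new ArrayList<>(lemmas);
    }
}
```

For the three keys listed above for birth%2:29:00::, nounLemmas yields a single entry, birth. Multi-word lemmas keep their underscores (for example give_birth), so replace them with spaces if you need display text.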

Source: a similar question on Stack Overflow, https://stackoverflow.com/questions/17275392/
