java - Stanford NLP sentiment ambiguous results

Tags: java algorithm nlp stanford-nlp sentiment-analysis

I am using Stanford NLP v3.6 (Java) to compute the sentiment of English sentences.

Stanford NLP scores the polarity of a sentence on a scale from 0 to 4:

  • 0 Very negative
  • 1 Negative
  • 2 Neutral
  • 3 Positive
  • 4 Very positive
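
For reference, the 0 to 4 scale above can be turned into readable labels with a small helper. This is only a minimal sketch: `SentimentLabels` and `label` are hypothetical names chosen for illustration and are not part of Stanford CoreNLP.

```java
public class SentimentLabels {
    // Labels indexed by predicted class, following the 0-4 scale listed above.
    private static final String[] LABELS = {
        "Very negative", "Negative", "Neutral", "Positive", "Very positive"
    };

    /** Returns the label for a class index, or "Unknown" if it is out of range. */
    public static String label(int sentimentClass) {
        if (sentimentClass < 0 || sentimentClass >= LABELS.length) {
            return "Unknown";
        }
        return LABELS[sentimentClass];
    }

    public static void main(String[] args) {
        for (int i = 0; i < LABELS.length; i++) {
            System.out.println(i + " = " + label(i));
        }
    }
}
```

Returning "Unknown" for out-of-range values avoids the need for a magic number when a score cannot be mapped.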

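I ran a few very simple test cases and got very strange results.

Examples:

  1. Text = Jhon is good person, Sentiment = 3 (i.e. positive)
  2. Text = David is good person, Sentiment = 2 (i.e. neutral)

In the examples above, the sentences are identical except for the names David and Jhon, yet the sentiment values differ. Isn't this ambiguous?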

I use this Java code to compute the sentiment:

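    import java.util.Properties;

    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
    import edu.stanford.nlp.pipeline.Annotation;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
    import edu.stanford.nlp.trees.Tree;
    import edu.stanford.nlp.util.CoreMap;

    // The wrapper class name is illustrative; the original snippet showed only the method.
    public class SentimentExample {

        public static float calSentiment(String text) {
            // The pipeline must be initialized before proceeding further.
            // Note: this builds a new pipeline on every call, which reloads the models each time.
            Properties props = new Properties();
            props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

            int mainSentiment = 0;
            if (text != null && text.length() > 0) {
                int longest = 0;
                Annotation annotation = pipeline.process(text);

                // Use the predicted class of the longest sentence as the overall sentiment.
                for (CoreMap sentence : annotation.get(CoreAnnotations.SentencesAnnotation.class)) {
                    Tree tree = sentence.get(SentimentCoreAnnotations.SentimentAnnotatedTree.class);
                    int sentiment = RNNCoreAnnotations.getPredictedClass(tree);
                    String partText = sentence.toString();

                    if (partText.length() > longest) {
                        mainSentiment = sentiment;
                        longest = partText.length();
                    }
                }
            }
            // Guard against values outside the 0-4 scale.
            if (mainSentiment > 4 || mainSentiment < 0) {
                return -9999;
            }
            return mainSentiment;
        }
    }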

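Am I missing something in the Java code?

I also get a negative sentiment (i.e. less than 2) when the sentence is positive, and vice versa.

Thanks.

Here are the results I got with simple English sentences: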

Sentence: Tendulkar is a great batsman
Sentiment: 3
Sentence: David is a great batsman
Sentiment: 3
Sentence: Tendulkar is not a great batsman
Sentiment: 1
Sentence: David is not a great batsman
Sentiment: 2
Sentence: Shyam is not a great batsman
Sentiment: 1
Sentence: Dhoni loves playing football
Sentiment: 3
Sentence: John, Julia loves playing football
Sentiment: 3
Sentence: Drake loves playing football
Sentiment: 3
Sentence: David loves playing football
Sentiment: 2
Sentence: Virat is a good boy
Sentiment: 2
Sentence: David is a good boy
Sentiment: 2
Sentence: Virat is not a good boy
Sentiment: 1
Sentence: David is not a good boy
Sentiment: 2
Sentence: I love every moment of life
Sentiment: 3
Sentence: I hate every moment of life
Sentiment: 2
Sentence: I like dancing and listening to music
Sentiment: 3
Sentence: Messi does not like to play cricket
Sentiment: 1
Sentence: This was the worst movie I have ever seen
Sentiment: 0
Sentence: I really appreciated the movie
Sentiment: 1
Sentence: I really appreciate the movie
Sentiment: 3
Sentence: Varun talks in a condescending way
Sentiment: 2
Sentence: Ram is angry he did not win the tournament
Sentiment: 1
Sentence: Today's dinner was awful
Sentiment: 1
Sentence: Johny is always complaining
Sentiment: 3
Sentence: Modi's demonetisation has been very controversial and confusing
Sentiment: 1
Sentence: People are left devastated by floods and droughts
Sentiment: 2
Sentence: Chahal did a fantastic job by getting the 6 wickets
Sentiment: 3
Sentence: England played terribly bad
Sentiment: 1
Sentence: Rahul Gandhi is a funny man
Sentiment: 3
Sentence: Always be grateful to those who are generous towards you
Sentiment: 3
Sentence: A friend in need is a friend indeed
Sentiment: 3
Sentence: Mary is a jubilant girl
Sentiment: 2
Sentence: There is so much of love and hatred in this world
Sentiment: 3
Sentence: Always be positive
Sentiment: 3
Sentence: Always be negative
Sentiment: 1
Sentence: Never be negative
Sentiment: 1
Sentence: Stop complaining and start doing something
Sentiment: 2
Sentence: He is a awesome thief
Sentiment: 3
Sentence: Ram did unbelievably well in this year's exams
Sentiment: 2
Sentence: This product is well designed and easy to use
Sentiment: 3
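
One way to make the pattern in these results easier to see is to group sentences that differ only in their first word and compare the scores within each group. The sketch below does that for a few of the sentence/score pairs listed above. It assumes the name is always the first word, which holds for these samples but not in general; `NameSensitivity` and `scoresByTemplate` are hypothetical names, not CoreNLP API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

public class NameSensitivity {
    /**
     * Groups scored sentences by the text that follows the first word
     * (assumed to be the name) and collects the distinct scores per group.
     */
    public static Map<String, TreeSet<Integer>> scoresByTemplate(Map<String, Integer> scored) {
        Map<String, TreeSet<Integer>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : scored.entrySet()) {
            String sentence = entry.getKey();
            int firstSpace = sentence.indexOf(' ');
            // A one-word sentence has no remainder; use the sentence itself as the key.
            String template = firstSpace < 0
                    ? sentence
                    : "<NAME>" + sentence.substring(firstSpace);
            groups.computeIfAbsent(template, k -> new TreeSet<>()).add(entry.getValue());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sentence/score pairs copied from the results listed above.
        Map<String, Integer> scored = new LinkedHashMap<>();
        scored.put("Tendulkar is not a great batsman", 1);
        scored.put("David is not a great batsman", 2);
        scored.put("Shyam is not a great batsman", 1);
        scored.put("Dhoni loves playing football", 3);
        scored.put("Drake loves playing football", 3);
        scored.put("David loves playing football", 2);
        scored.put("Virat is a good boy", 2);
        scored.put("David is a good boy", 2);

        scoresByTemplate(scored).forEach((template, scores) -> {
            String verdict = scores.size() > 1 ? "name-sensitive" : "stable";
            System.out.println(template + " -> " + scores + " (" + verdict + ")");
        });
    }
}
```

For these pairs, the "not a great batsman" and "loves playing football" groups each contain two different scores, while "is a good boy" gets the same score for both names.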

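Best answer

The sentiment decision is made by a trained neural network. Unfortunately, it behaves strangely depending on which name you put into the same sentence, but that is to be expected. As discussed on GitHub, one factor may be that many names do not occur often in the training data.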
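
To illustrate why a single token can change the label: the predicted class is the highest-scoring of the five classes, so when two classes score almost the same, a small shift in the scores is enough to flip the result. The sketch below uses made-up probabilities, not real model output, and `argMax` is a hypothetical helper. In CoreNLP, the per-class scores for a sentence tree should be readable with `RNNCoreAnnotations.getPredictions(tree)`, which may help show how close the top two classes are.

```java
public class ArgMaxFlip {
    /** Returns the index of the largest value (the first one on ties). */
    public static int argMax(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical probabilities for classes 0..4; NOT real model output.
        double[] withNameA = {0.02, 0.08, 0.42, 0.44, 0.04};
        double[] withNameB = {0.02, 0.08, 0.45, 0.41, 0.04};

        System.out.println("Name A -> class " + argMax(withNameA)); // class 3 (positive)
        System.out.println("Name B -> class " + argMax(withNameB)); // class 2 (neutral)
    }
}
```

A shift of only a few hundredths between classes 2 and 3 changes the reported label, even though the two distributions are nearly identical.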

Regarding "java - Stanford NLP sentiment ambiguous results", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42027119/
