python - NLTK package, label not defined

Tags: python analytics nltk text-analysis

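I am very new to Python, and this is the first code I have written. I am trying to use the NLTK package. The problem occurs at the very end, when I try to execute the line label_probdist.prob('positive'). This is the error I get: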

name 'label_probdist' is not defined
NameError Traceback (most recent call last)
<ipython-input-57-006d791d4445> in <module>()
----> 1 print label_probdist.prob('positive')
NameError: name 'label_probdist' is not defined

import nltk, re, pprint
import csv

from nltk import word_tokenize, wordpunct_tokenize
from nltk.probability import FreqDist, DictionaryProbDist, ELEProbDist, sum_logs
from nltk.classify.api import ClassifierI

# Not in use: nltk.download()  # downloads the book package

# Open the file that contains the wall posts and their sentiment labels
with open('Classified.csv' ,'rb') as f: 
    reader = csv.reader(f)
    FBsocial = map(tuple, reader)

import random
random.shuffle(FBsocial)
FBsocial = FBsocial[:500]
len(FBsocial)

FBSocialData = []   #sorting data
for row in FBsocial:
    statement = row[0]
    sentiment = row[1]
    words_filtered = [e.lower() for e in statement.split() if len(e) >= 3] 
    FBSocialData.append((words_filtered, sentiment))

len(FBSocialData)

# Extract word features (list of words ordered by frequency)

def get_words_in_FBdata(FBSocialData):
    all_words = []
    for (statement, sentiment) in FBSocialData:
        all_words.extend(statement)
    return all_words

def get_word_features(wordlist):
    wordlist = nltk.FreqDist(wordlist)
    word_features = wordlist.keys()
    return word_features

word_features = get_word_features(get_words_in_FBdata(FBSocialData))

len(word_features)

#just a test;
document = ("hei","grin","andre","jævlig","gøy",)

# Feature extractor: flags which of the known words occur in a document

def extract_features(document):
    document_words = set(document)
    features = {}
    for word in word_features:
        features['contains(%s)' % word] = (word in document_words)
    return features

extract_features(document)
# Testing extract_features; split the string so a list of words is passed, not single characters
extract_features("udviser blomsterbutik".split())

training_set = nltk.classify.util.apply_features(extract_features, FBSocialData)
len(training_set)
classifier = nltk.NaiveBayesClassifier.train(training_set)

def train(labeled_featuresets, estimator=nltk.probability.ELEProbDist):
    # Create the P(label) distribution
    label_probdist = estimator(label_freqdist)
    # Create the P(fval|label, fname) distribution
    feature_probdist = {}
    return NaiveBayesClassifier(label_probdist, feature_probdist)

# Prior probability P(label) of each sentiment label
print label_probdist.prob('positive')
print label_probdist.prob('negative')

Best answer

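You define the variable label_probdist inside the function train, and then you try to access it outside that function's scope. That is not possible: it is a local variable, not a global one.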
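
The scoping rule can be illustrated with a minimal sketch that does not depend on NLTK. The function name build_label_probs, the toy data, and the plain relative-frequency estimate are all assumptions made for illustration; they are not the question's code or NLTK's estimator. The point is only that a name assigned inside a function becomes usable outside once the function returns it and the caller binds the result. The print calls use the function form so the snippet is not tied to the Python 2 syntax used in the question.

```python
from collections import Counter


def build_label_probs(labeled_data):
    """Return P(label) as a dict (hypothetical helper, not part of NLTK)."""
    # label_counts is local: it only exists while this function runs.
    label_counts = Counter(label for _, label in labeled_data)
    total = float(sum(label_counts.values()))
    # Plain relative frequencies; NLTK's ELEProbDist would smooth these.
    return {label: count / total for label, count in label_counts.items()}


# Toy stand-in for FBSocialData: (words, sentiment) pairs.
toy_data = [
    (["good", "day"], "positive"),
    (["bad", "service"], "negative"),
    (["nice", "flowers"], "positive"),
    (["great", "shop"], "positive"),
]

# Bind the returned value to a module-level name before using it.
label_probs = build_label_probs(toy_data)
print(label_probs["positive"])
print(label_probs["negative"])
```

Note also that, in the code as posted, train is never called and refers to label_freqdist, which is not defined anywhere in the snippet, so calling it would raise a NameError of its own. For the classifier that was actually trained, the public prob_classify method returns a probability distribution for a given feature set, for example classifier.prob_classify(extract_features(document)).prob('positive'). The label distribution itself seems to be kept in the private attribute classifier._label_probdist, which should be treated as an implementation detail that may change between NLTK versions.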

Regarding python - NLTK package, label not defined, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/34153955/
