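python - Coherence graph is blank - coherence values are nan

Tags: python graph nan lda mallet

Thanks for stopping by. I'm looking for help with a graph that renders blank. I'm following section #17 of this tutorial, https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/, which uses LdaMallet to plot coherence scores for different numbers of topics. Here is my code: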

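import os

import gensim
import matplotlib.pyplot as plt
from gensim.models import CoherenceModel

# Point gensim's wrapper at the local MALLET install.
# Note: gensim.models.wrappers (and LdaMallet) is only available in gensim < 4.0.
os.environ['MALLET_HOME'] = 'C:\\mallet\\mallet-2.0.8'

mallet_path = 'C:\\mallet\\mallet-2.0.8\\bin\\mallet'
# Map each token to an integer id, then convert every document to bag-of-words.
dictionary = gensim.corpora.Dictionary(processed_docs)
bow_corpus = [dictionary.doc2bow(doc) for doc in processed_docs]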



def compute_coherence_values(dictionary, bow_corpus, documents, limit, start=2, step=3):
    """
    Compute c_v coherence for various number of topics

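    Parameters:
    ----------
    dictionary : Gensim dictionary (built from the Premium Billing data)
    bow_corpus : Gensim bag-of-words corpus
    documents : Tokenized texts, one list of tokens per document
    limit : Max num of topics (exclusive)
    start : Min num of topics
    step : Increment between topic counts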

    Returns:
    -------
    model_list : List of LDA topic models
    coherence_values : Coherence values corresponding to the LDA model with respective number of topics
    """
    coherence_values = []
    model_list = []
    for num_topics in range(start, limit, step):
        model = gensim.models.wrappers.LdaMallet(mallet_path, corpus=bow_corpus, num_topics=num_topics, id2word=dictionary)
        model_list.append(model)
        coherencemodel = CoherenceModel(model=model, texts=documents, dictionary=dictionary, coherence='c_v')
        coherence_values.append(coherencemodel.get_coherence())

    return model_list, coherence_values
    
# Can take a long time to run.
model_list, coherence_values = compute_coherence_values(dictionary=dictionary, bow_corpus=bow_corpus,
                                                        documents=documents, start=2, limit=40, step=6)
                                                        
# Show graph
limit, start, step = 40, 2, 6
x = range(start, limit, step)
plt.plot(x, coherence_values)
plt.xlabel("Num Topics")
plt.ylabel("Coherence score")
# Pass the label in a list; a bare string would be split into single characters.
plt.legend(["coherence_values"], loc='best')
plt.show()

it's drawing a blank and so am I

# Print the coherence scores
for m, cv in zip(x, coherence_values):
    print("Num Topics =", m, " has Coherence Value of", round(cv, 4))

nan like the bread

Data:

dictionary

bow_corpus

print stuff

What I'd like it to look like:

the dream

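Please help.

Best answer

I think the problem is the value assigned to the `texts` parameter of CoherenceModel. I'm not sure how you defined the `documents` value you pass in, but I used the following: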

coherence_model_lda = CoherenceModel(model=lda_model, texts=[tokens], dictionary=dict, coherence='c_v')

I defined `tokens` as a list of words. When I passed texts=tokens, it gave me nan; when I wrapped it in a list as shown above, it worked fine!

A similar question to "python - Coherence graph is blank - coherence values are nan" can be found on Stack Overflow: https://stackoverflow.com/questions/55816686/
