python - How to log epochs in Gensim's LdaModel

Tags: python python-3.x gensim

I am trying to display training progress for my LdaModel, but every example I have found online raises an exception:

l =  gensim.models.callbacks.CoherenceMetric(corpus=common_corpus, logger='shell')
lda = gensim.models.ldamodel.LdaModel(doc_term_matrix, num_topics=genres_count, id2word = common_corpus, passes=150, callbacks=[l])

throws:

  File "<ipython-input-165-6ad0e2e8516c>", line 2, in <module>
    lda = gensim.models.ldamodel.LdaModel(doc_term_matrix, num_topics=genres_count, id2word = common_corpus, passes=150, callbacks=[l])

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\ldamodel.py", line 371, in __init__
    self.update(corpus, chunks_as_numpy=use_numpy)

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\ldamodel.py", line 750, in update
    current_metrics = callback.on_epoch_end(pass_)

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\callbacks.py", line 288, in on_epoch_end
    value = metric.get_value(topics=topics, model=self.model, other_model=self.previous)

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\callbacks.py", line 105, in get_value
    coherence=self.coherence, topn=self.topn

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\coherencemodel.py", line 190, in __init__
    self.window_size = SLIDING_WINDOW_SIZES[self.coherence]

KeyError: None

This code (found here):

class EpochLogger(CallbackAny2Vec):
    '''Callback to log information about training'''

    def __init__(self):
        self.epoch = 0

    def on_epoch_begin(self, model):
        print("Epoch #{} start".format(self.epoch))

    def on_epoch_end(self, model):
        print("Epoch #{} end".format(self.epoch))
        self.epoch += 1

l = EpochLogger()
lda = gensim.models.ldamodel.LdaModel(doc_term_matrix, num_topics=genres_count, id2word = common_corpus, passes=150, callbacks=[l])

throws:

Traceback (most recent call last):

  File "<ipython-input-167-e89e2bf41977>", line 1, in <module>
    lda = gensim.models.ldamodel.LdaModel(doc_term_matrix, num_topics=genres_count, id2word = common_corpus, passes=150, callbacks=[l])

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\ldamodel.py", line 371, in __init__
    self.update(corpus, chunks_as_numpy=use_numpy)

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\ldamodel.py", line 688, in update
    callback.set_model(self)

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\callbacks.py", line 264, in set_model
    if any(metric.logger == "visdom" for metric in self.metrics):

  File "C:\Users\me\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\callbacks.py", line 264, in <genexpr>
    if any(metric.logger == "visdom" for metric in self.metrics):

AttributeError: 'EpochLogger' object has no attribute 'logger'

For now I am mostly interested in monitoring training progress (eyeballing an ETA).

What is the correct way to set up the callbacks?

Best answer

Change:

l =  gensim.models.callbacks.CoherenceMetric(corpus=common_corpus, logger='shell')

to:

l =  gensim.models.callbacks.CoherenceMetric(corpus=common_corpus, coherence="u_mass", logger='shell')

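"u_mass" needs only a corpus. Without an explicit coherence argument the callback passes coherence=None on to CoherenceModel, which then fails on SLIDING_WINDOW_SIZES[self.coherence]; that is the KeyError: None in the first traceback.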

For python - How to log epochs in Gensim's LdaModel, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55694532/
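
On the ETA part of the question: the second traceback shows LdaModel reading metric.logger on every callback, so it appears to expect metric objects from gensim.models.callbacks rather than a CallbackAny2Vec subclass. One workaround that avoids callbacks entirely is to listen to gensim's own log output. The sketch below is only that, a sketch: LdaPassTimer and attach_timer are hypothetical names, and it assumes that LdaModel logs lines of the form "PROGRESS: pass 3, at document #2000/5000" to the logger named gensim.models.ldamodel, which may differ between gensim versions.

```python
import logging
import re
import time


class LdaPassTimer(logging.Handler):
    """Hypothetical helper: estimates remaining time from PROGRESS log lines."""

    # Assumed message shape: "PROGRESS: pass 3, at document #2000/5000"
    _pattern = re.compile(r"PROGRESS: pass (\d+), at document #(\d+)/(\d+)")

    def __init__(self, total_passes, clock=time.monotonic):
        super().__init__(level=logging.INFO)
        self.total_passes = total_passes
        self.clock = clock
        self.start = clock()
        self.eta_seconds = None

    def emit(self, record):
        match = self._pattern.search(record.getMessage())
        if match is None:
            return  # not a progress line, ignore it
        pass_no, doc_no, doc_total = (int(g) for g in match.groups())
        # Fraction of all work done, counting the partly finished pass
        # (pass numbers are assumed to be zero-based).
        done = (pass_no + doc_no / doc_total) / self.total_passes
        if done <= 0:
            return
        elapsed = self.clock() - self.start
        # Linear extrapolation: remaining time scales with remaining work.
        self.eta_seconds = elapsed * (1 - done) / done
        print(f"pass {pass_no + 1}/{self.total_passes}, "
              f"ETA {self.eta_seconds:.0f}s")


def attach_timer(total_passes, logger_name="gensim.models.ldamodel"):
    """Attach the timer to the (assumed) LdaModel logger and return it."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    timer = LdaPassTimer(total_passes)
    logger.addHandler(timer)
    return timer
```

Calling attach_timer(total_passes=150) just before constructing the LdaModel would then print a rough estimate each time a progress line is logged. The estimate is a plain linear extrapolation, so it is only as good as the assumption that every pass takes about the same time.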
