python - Generating new sentences from an n-gram model with NLTK

I built 2-gram and 3-gram models from a text file.

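import string
import nltk
from nltk import word_tokenize
# Read the corpus
with open('Alice in Wonderland.txt', 'r') as f:
    text = f.read()
# Strip punctuation (Python 3 form of maketrans/translate), then lowercase and tokenize
table = str.maketrans('', '', string.punctuation)
text = text.translate(table)
tokens = word_tokenize(text.lower())
# nltk.bigrams and nltk.trigrams return generators, so materialize them as lists
bigram = list(nltk.bigrams(tokens))
trigram = list(nltk.trigrams(tokens))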

But how do I use these models to generate new sentences?

Best answer

Currently, NLTK's generate() function is deprecated because it is broken; see https://github.com/nltk/nltk/issues/1180

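The state-of-the-art alternative is to generate text with a recurrent neural network, e.g. https://github.com/karpathy/char-rnn (Note: unlike a traditional n-gram based Markov model, char-RNN does not use n-gram information.)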

Alternatively, you can implement your own Markov model (a Markov chain over n-grams); see http://fulmicoton.com/posts/shannon-markov/
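
As a rough illustration of the do-it-yourself route, here is a minimal sketch of an n-gram Markov chain generator that uses only the standard library. The names build_model and generate are hypothetical helpers made up for this example, not part of NLTK, and the toy token list stands in for the tokens variable from the question. It shows the general idea only and is not necessarily the approach taken in the linked post.

```python
import random
from collections import defaultdict


def build_model(tokens, n=3):
    """Map each (n-1)-word context to the list of words observed after it."""
    model = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context].append(tokens[i + n - 1])
    return model


def generate(model, length=15, seed=None):
    """Walk the chain, sampling each next word in proportion to its observed count."""
    rng = random.Random(seed)
    context = rng.choice(list(model))   # pick a random starting context
    k = len(context)
    words = list(context)
    while len(words) < length:
        followers = model.get(context)
        if not followers:               # dead end: this context has no recorded successor
            break
        words.append(rng.choice(followers))
        context = tuple(words[len(words) - k:])
    return ' '.join(words)


# Toy data (assumption): replace with the `tokens` list built from the corpus.
tokens = 'the cat sat on the mat and the cat ran to the mat'.split()
model = build_model(tokens, n=2)        # n=2 gives a bigram model, n=3 a trigram model
print(generate(model, length=8, seed=0))
```

Because followers are stored with repetition, rng.choice samples them by relative frequency without needing explicit probabilities. With a real corpus, a higher n tends to give more fluent output but copies longer stretches of the source text verbatim.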

Source: a similar question on Stack Overflow, "python - Generating new sentences from an n-gram model with NLTK": https://stackoverflow.com/questions/33597485/