python - How to give nltk multiple separate sentences

I have a list of English sentences (each sentence is itself a list of tokens), and I want to get the ngrams. For example:

sentences = [['this', 'is', 'sentence', 'one'], ['hello','again']]

To run nltk.util.ngrams, I need to flatten the list to:

sentences = ['this','is','sentence','one','hello','again']

But then I get a wrong bigram, ('one','hello'), which spans two sentences. What is the best way to handle this?

Thanks!

Best answer

Try this:

from itertools import chain

sentences = list(chain(*sentences))

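chain returns a chain object whose .__next__() method returns elements from the first iterable until it is exhausted, then elements from the next iterable, and so on until all of the iterables are exhausted.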
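
The lazy, one-element-at-a-time behavior of chain can be seen in a short sketch. It uses chain.from_iterable, which should give the same result as chain(*sentences) without unpacking the outer list; the variable names flat_iter, first and rest are only illustrative.

```python
from itertools import chain

sentences = [['this', 'is', 'sentence', 'one'], ['hello', 'again']]

# chain.from_iterable(sentences) walks each inner list in turn,
# equivalent to chain(*sentences) but without argument unpacking.
flat_iter = chain.from_iterable(sentences)

# The chain object is a lazy iterator: next() pulls a single token.
first = next(flat_iter)

# list() drains the remaining tokens, moving on to the second
# sentence once the first one is exhausted.
rest = list(flat_iter)

print(first)
print(rest)
```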

Or you can do this:

 sentences = [i for s in sentences for i in s]
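
Flattening by itself still places 'one' next to 'hello', so the bigram ('one','hello') from the question would still appear. One possible way around that, sketched here under the assumption that ngrams should never cross a sentence boundary, is to build the ngrams per sentence and flatten the results instead of the tokens. The helper ngrams_within is hypothetical and written in plain Python so the snippet runs without nltk; nltk.util.ngrams(s, 2) should be usable in its place.

```python
def ngrams_within(tokens, n):
    """Return the n-grams of a single token list as tuples (hypothetical helper)."""
    # Zip the list with shifted copies of itself: tokens, tokens[1:], ...
    return list(zip(*(tokens[i:] for i in range(n))))

sentences = [['this', 'is', 'sentence', 'one'], ['hello', 'again']]

# Build bigrams sentence by sentence, then flatten the bigrams
# (not the tokens), so no bigram spans two sentences.
bigrams = [gram for s in sentences for gram in ngrams_within(s, 2)]

print(bigrams)
```

A sentence shorter than n simply contributes no ngrams in this sketch.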

Regarding python - How to give nltk multiple separate sentences, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52606753/