python - Getting started with gensim, error: No such file or directory: 'text8'

Tags: python python-3.x error-handling gensim word2vec

I am learning about the word2vec and GloVe models in Python, so I am working through the tutorial available here.

After entering this code step by step in IDLE 3:

>>> from gensim.models import word2vec
>>> import logging
>>> logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
>>> sentences = word2vec.Text8Corpus('text8')
>>> model = word2vec.Word2Vec(sentences, size=200)

I get this error:

2017-01-13 11:15:41,471 : INFO : collecting all words and their counts
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    model = word2vec.Word2Vec(sentences, size=200)
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 469, in __init__
    self.build_vocab(sentences, trim_rule=trim_rule)
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 533, in build_vocab
    self.scan_vocab(sentences, progress_per=progress_per, trim_rule=trim_rule)  # initial survey
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 545, in scan_vocab
    for sentence_no, sentence in enumerate(sentences):
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/word2vec.py", line 1536, in __iter__
    with utils.smart_open(self.fname) as fin:
  File "/usr/local/lib/python3.5/dist-packages/smart_open-1.3.5-py3.5.egg/smart_open/smart_open_lib.py", line 127, in smart_open
    return file_smart_open(parsed_uri.uri_path, mode)
  File "/usr/local/lib/python3.5/dist-packages/smart_open-1.3.5-py3.5.egg/smart_open/smart_open_lib.py", line 558, in file_smart_open
    return open(fname, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'text8'

How can I fix this? Thanks in advance for your help.

Best answer

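You appear to be missing the file used here. Specifically, the code tries to open a file named text8 but cannot find it (hence the FileNotFoundError). Because 'text8' is a relative path, Python looks for it in the current working directory of the interpreter.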
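A quick way to confirm this is to check where the relative name resolves. The following is a minimal sketch using only the standard library; the helper name describe_path is hypothetical and is not part of gensim.

```python
import os


def describe_path(fname):
    """Return the absolute path a relative name resolves to, and whether a file exists there."""
    resolved = os.path.abspath(fname)  # resolved against the current working directory
    return resolved, os.path.isfile(resolved)


resolved, found = describe_path("text8")
print("Looking for:", resolved)
print("File exists:", found)
```

If "File exists" prints False, either move the corpus into that directory or pass its full path to Text8Corpus.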

You can download the file itself from here, as mentioned in the documentation for Text8Corpus:

Docstring:      
Iterate over sentences from the "text8" corpus, unzipped from http://mattmahoney.net/dc/text8.zip .

and make it available. Extract the archive, then pass the path to the extracted file as the argument to Text8Corpus:

sentences = word2vec.Text8Corpus('/path/to/text8')
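The download and extraction can also be scripted. This is a rough sketch, not part of the original answer: the helper names fetch_text8 and extract_text8 are hypothetical, the URL is the one quoted in the Text8Corpus docstring, and the code assumes the archive contains a single member named text8.

```python
import os
import urllib.request
import zipfile

# URL taken from the Text8Corpus docstring quoted above.
TEXT8_URL = "http://mattmahoney.net/dc/text8.zip"


def extract_text8(zip_path, target_dir):
    """Extract the 'text8' member (assumed name) and return its path."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extract("text8", path=target_dir)
    return os.path.join(target_dir, "text8")


def fetch_text8(target_dir="."):
    """Return the path to text8, downloading and unzipping only if it is missing."""
    corpus_path = os.path.join(target_dir, "text8")
    if os.path.isfile(corpus_path):
        return corpus_path  # already present, nothing to download
    os.makedirs(target_dir, exist_ok=True)
    zip_path = os.path.join(target_dir, "text8.zip")
    urllib.request.urlretrieve(TEXT8_URL, zip_path)
    return extract_text8(zip_path, target_dir)
```

Calling fetch_text8("data") would return a path such as data/text8, which can then be passed to word2vec.Text8Corpus in place of '/path/to/text8'.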

Original question on Stack Overflow: https://stackoverflow.com/questions/41628856/
