python - Text cannot be decoded when using pyinstaller

Tags: python python-3.x encoding compilation pyinstaller

I am trying to extract text from a .txt file, but I get this error:

ERROR:root:decode error:
Traceback (most recent call last):
  File "ml_funcs/tokenizer.py", line 15, in extract_text
  File "textract/parsers/__init__.py", line 77, in process
  File "textract/parsers/utils.py", line 46, in process
  File "textract/parsers/txt_parser.py", line 9, in extract
  File "/Users/ivanlavrenov/projects/project/.venv2/lib/python3.7/codecs.py", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 23: invalid start byte

For some unknown reason, the PyInstaller build cannot decode text read with open(file.txt).


When I launch the built executable on another computer, textract cannot decode any text. Adding hidden imports did not help. Here is my .spec file:

# -*- mode: python -*-
import sys
from PyInstaller.utils.hooks import collect_data_files

block_cipher = None

a = Analysis(['main.py'],
             pathex=['/Users/ivanlavrenov/projects/project'],
             binaries=[],
             datas=[],
             hiddenimports=[],
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher)

a.datas += [('messages.properties',
             'messages.properties', 'DATA'), ]

a.datas += Tree('.venv2/lib/python3.7/site-packages/langdetect/profiles/',
                prefix='langdetect/profiles/')

a.datas += Tree('.venv2/lib/python3.7/site-packages/stop_words/stop-words/',
                prefix='stop-words/')

a.datas += Tree('./desktop_app/images/', prefix='desktop_app/images/')

a.hiddenimports.append("textract.parsers")
a.hiddenimports.append("docx2txt")
a.hiddenimports.append("csv")
a.hiddenimports.append("xlrd")
a.hiddenimports.append("chardet")
a.hiddenimports.append("codecs")

print(a.hiddenimports)
pyz = PYZ(a.pure, a.zipped_data,
             cipher=block_cipher)

exe = EXE(pyz,
          a.scripts,
          a.binaries,
          a.zipfiles,
          a.datas,
          name='main',
          debug=False,
          strip=False,
          upx=True,
          runtime_tmpdir=None,
          console=True,
          icon='desktop_app/images/icon.icns')

If anyone has any ideas, it would be a great help.

Best answer

I solved the problem by detecting the file's encoding with chardet before decoding.
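
The answer does not show its code, so the following is only a minimal sketch of what chardet-based detection might look like. The function name `read_text` and its `default_encoding` parameter are hypothetical, and reading the file directly instead of going through textract's txt parser is an assumption. The sketch tries UTF-8 first, asks `chardet.detect` for a guess only when that fails, and falls back to a lossy decode as a last resort.

```python
from pathlib import Path

try:
    import chardet  # third-party; must also be bundled into the frozen app
except ImportError:  # keep the sketch usable when chardet is not installed
    chardet = None


def read_text(path, default_encoding="utf-8"):
    """Read a text file, guessing the encoding when the default fails.

    Hypothetical helper for illustration; not taken from the original answer.
    """
    raw = Path(path).read_bytes()

    # Most files are valid UTF-8, so try the strict decode first.
    try:
        return raw.decode(default_encoding)
    except UnicodeDecodeError:
        pass

    # chardet.detect returns a dict; its "encoding" entry may be None.
    guess = chardet.detect(raw)["encoding"] if chardet else None
    if guess:
        try:
            return raw.decode(guess)
        except (UnicodeDecodeError, LookupError):
            pass  # wrong guess or a codec Python does not know

    # Last resort: keep going, replacing undecodable bytes with U+FFFD.
    return raw.decode(default_encoding, errors="replace")
```

With a helper like this, a file containing a byte such as 0x87 (the one in the traceback above) yields a string instead of raising UnicodeDecodeError. Detection is a statistical guess, so short files may be decoded with the wrong codec.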

Regarding "python - Text cannot be decoded when using pyinstaller", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52537634/
