python - UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 2370: ordinal not in range(128)

Tags: python string unicode decode encode

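I am writing a script in Python 3.5.3 that takes username/password combos from one file and writes them to another file. The script was written and run on a machine running Windows 10. However, when I tried to run it on a MacBook running Yosemite, I got an error related to ASCII encoding.

The relevant function is this: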

def buildDatabase():
        print("Building database, this may take some time...")
        passwords = open("10-million-combos.txt", "r") #File with user/pword combos.
        hashWords = open("Hashed Combos.txt", "a") #File where user/SHA-256 encrypted pwords will be stored.
        j = 0
        hashTable = [[ None ] for x in range(60001)] #A hashtable with 30,000 elements, quadratic probing means size must = 2 x the final size + 1
        for line in passwords: 
                toSearch = line 
                i = q = toSearch.find("\t") #The username/pword combos are formatted: username\tpassword\n.
                n = toSearch.find("\n")
                password = line[i:n-1] #i is the start of the password, n is the end of it
                username = toSearch[ :q] + ":" #q is the end of the username
                byteWord = password.encode('UTF-8')
                sha.update(byteWord)
                toWrite = sha.hexdigest() #password is encrypted to UTF-8, run thru SHA-256, and stored in toWrite
                skip = False
                if len(password) == 0: #if len(password) is 0, just skip it
                        skip = True
                if len(password) == 1:
                        doModulo = ord(password[0]) ** 4
                if len(password) == 2:
                        doModulo = ord(password[0]) * ord(password[0]) * ord(password[1]) * ord(password[1])
                if len(password) == 3:
                        doModulo = ord(password[0]) * ord(password[0]) * ord(password[1]) * ord(password[2])
                if len(password) > 3:
                        doModulo = ord(password[0]) * ord(password[1]) * ord(password[2]) * ord(password[3])
                assignment = doModulo % 60001
                #The if block above gives each combo an assignment number for a hash table, indexed by password because they're more unique than usernames
                successful = False
                collision = 0

The error is as follows:

Traceback (most recent call last):
  File "/Users/connerboehm/Documents/Conner B/PythonFinalProject.py", line 104, in <module>
    buildDatabase()
  File "/Users/connerboehm/Documents/Conner B/PythonFinalProject.py", line 12, in buildDatabase
    for line in passwords:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 2370: ordinal not in range(128)

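What's happening here? I have never gotten this error on Windows, and I can't see any problem with my attempt to encode into UTF-8.

Edit: Notepad was saving the file as ANSI. Changing the encoding to UTF-8 (by just copying and pasting the data into a new .txt file) solved the problem.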
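As an alternative to copying and pasting in an editor, the same conversion can be done in Python. The following is only a minimal sketch: the helper name convert_to_utf8 and the sample file names are made up for illustration, and it assumes that "ANSI" here means the Windows-1252 code page (cp1252), which is what Notepad typically uses on a Western-locale Windows install. If the file was saved under a different code page, src_encoding must be changed to match.

```python
import os
import tempfile


def convert_to_utf8(src_path, dst_path, src_encoding="cp1252"):
    """Re-encode a text file to UTF-8 (hypothetical helper, not from the question).

    src_encoding is an assumption: cp1252 is a common meaning of "ANSI".
    newline="" keeps the original line endings untouched on both sides.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Small self-contained demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    ansi_path = os.path.join(tmp, "combos-ansi.txt")
    utf8_path = os.path.join(tmp, "combos-utf8.txt")

    # One username\tpassword line containing the byte 0xaa from the traceback.
    with open(ansi_path, "wb") as f:
        f.write(b"bob\tsecret\xaa\r\n")

    convert_to_utf8(ansi_path, utf8_path)

    with open(utf8_path, "rb") as f:
        converted_bytes = f.read()

# In UTF-8 the character U+00AA is stored as the two bytes 0xc2 0xaa.
print(converted_bytes)
```

Even after converting, it is safer to open the new file with encoding="utf-8" explicitly, because the default encoding used by open() depends on the platform and locale.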

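Best answer

Your program doesn't say which codec is used in the file "10-million-combos.txt", so in this case Python tries to decode it as ASCII. 0xaa is not an ASCII ordinal, so decoding fails. When no encoding is given, open() falls back to the platform's preferred locale encoding, which is why the same script can work on one machine and fail on another. Identify which codec the file uses and pass it as the encoding parameter of open().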
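The advice above can be sketched as follows. This is an illustration under stated assumptions, not the asker's actual data: the sample line and file name are invented, and cp1252 is assumed to be the file's real encoding (a common meaning of "ANSI" on Windows). The snippet writes a tiny file containing byte 0xaa, shows that decoding it as ASCII raises the same error, and then reads it successfully by passing the encoding to open().

```python
import locale
import os
import tempfile

# Hypothetical sample line in the username\tpassword\n format from the question.
# "\xaa" is the character U+00AA, which cp1252 stores as the single byte 0xaa.
sample = "alice\tpass\xaa\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "combos-sample.txt")
    with open(path, "w", encoding="cp1252") as f:
        f.write(sample)

    # The encoding open() uses when none is given; it varies by platform/locale.
    default_encoding = locale.getpreferredencoding(False)
    print("default encoding on this machine:", default_encoding)

    # Forcing ASCII reproduces the failure from the traceback.
    try:
        with open(path, "r", encoding="ascii") as f:
            f.read()
        ascii_failed = False
    except UnicodeDecodeError as exc:
        ascii_failed = True
        print("ASCII decode failed:", exc)

    # Passing the file's actual encoding (assumed cp1252 here) fixes the read.
    with open(path, "r", encoding="cp1252") as f:
        lines = [line for line in f]

print(lines)
```

In the original function this would amount to changing the first open call to something like open("10-million-combos.txt", "r", encoding="cp1252"), with the encoding name replaced by whatever the file really uses.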

A similar question about python - UnicodeDecodeError: 'ascii' codec can't decode byte 0xaa in position 2370: ordinal not in range(128) can be found on Stack Overflow: https://stackoverflow.com/questions/45393709/
