I'm writing a Python script that needs to read several different kinds of files. After calling f = open("file.txt", "r"), I use the usual for line in f loop to read the file line by line.
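This doesn't seem to work for every file. My guess is that some of the files use different line endings (for example \r\n versus \r). I could read the whole file and split the string on \r, but that is very expensive and I'd rather not do it. Is there a way to make Python's readline method recognize both line-ending variants?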
Best answer
Use universal newline support; see http://docs.python.org/library/functions.html#open
In addition to the standard fopen() values mode may be 'U' or 'rU'. Python is usually built with universal newline support; supplying 'U' opens the file as a text file, but lines may be terminated by any of the following: the Unix end-of-line convention '\n', the Macintosh convention '\r', or the Windows convention '\r\n'. All of these external representations are seen as '\n' by the Python program. If Python is built without universal newline support a mode with 'U' is the same as normal text mode. Note that file objects so opened also have an attribute called newlines which has a value of None (if no newlines have yet been seen), '\n', '\r', '\r\n', or a tuple containing all the newline types seen.
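The passage quoted above describes Python 2. In Python 3, text mode applies universal newlines by default (newline=None), and the 'U' mode flag was deprecated and then removed in Python 3.11. Below is a minimal sketch of line-by-line reading under that assumption; the helper name read_lines and the sample file contents are made up for illustration and are not part of the original answer.

```python
import os
import tempfile


def read_lines(path):
    # Hypothetical helper: iterate over a text file line by line.
    # newline=None is the Python 3 default and turns on universal newlines,
    # so LF, CR and CRLF all reach the program as a single LF.
    with open(path, "r", newline=None) as f:
        lines = [line.rstrip("\n") for line in f]
        # f.newlines records which line-ending styles were actually seen.
        seen = f.newlines
    return lines, seen


# Build a sample file that mixes all three conventions (illustrative data only).
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"unix\nold mac\rwindows\r\nlast")
    sample_path = tmp.name

try:
    lines, seen = read_lines(sample_path)
finally:
    os.remove(sample_path)

print(lines)
print(seen)
```

The file is still read lazily, one line at a time, so nothing has to be loaded and split in memory.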
Regarding "python - Getting Python's readline method to recognize both line-ending variants?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4158645/