python - How can I use io primitives (seek, read) on a file stream that may be in universal mode?

I have a file object that may or may not have been opened in universal mode. (If it helps, I can access the mode via file.mode.)

I want to work with this file using the standard io methods: read and seek.

If I open the file in non-universal mode, everything works as expected:

In [1]: f = open('example', 'r')

In [2]: f.read()
Out[2]: 'Line1\r\nLine2\r\n' # uhoh, this file has carriage returns

In [3]: f.seek(0)

In [4]: f.read(8)
Out[4]: 'Line1\r\nL'

In [5]: f.seek(-8, 1)

In [6]: f.read(8)
Out[6]: 'Line1\r\nL' # as expected, this is the same as before

In [7]: f.close()

However, if I open the file in universal mode, there is a problem:

In [8]: f = open('example', 'rU')

In [9]: f.read()
Out[9]: 'Line1\nLine2\n' # no carriage returns - thanks, 'U'!

In [10]: f.seek(0)

In [11]: f.read(8)
Out[11]: 'Line1\nLi'

In [12]: f.seek(-8, 1)

In [13]: f.read(8)
Out[13]: 'ine1\nLin' # NOT the same output, as what we read as '\n' was *2* bytes

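Python interprets \r\n as \n and returns a string of length 8.

However, building that string required reading 9 bytes from the file.

As a result, when we try to use seek to undo the read, we do not end up back where we started!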
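
The byte arithmetic behind this can be reproduced without a file at all. The following is only a sketch that models universal-newline translation with a plain `bytes.replace`; the names `raw`, `chunk`, `consumed` and `reread` are illustrative and do not come from the session above.

```python
# Raw contents of the example file, including the carriage returns.
raw = b"Line1\r\nLine2\r\n"

# Model universal-newline translation: each "\r\n" pair becomes "\n".
translated = raw.replace(b"\r\n", b"\n")
chunk = translated[:8]  # what read(8) returned in universal mode

# Shortest raw prefix whose translation equals that 8-character chunk.
consumed = next(
    n for n in range(len(raw) + 1)
    if raw[:n].replace(b"\r\n", b"\n") == chunk
)

# seek(-8, 1) moves back 8 raw bytes from the consumed position,
# so the next read starts one byte too late.
new_pos = consumed - 8
reread = raw[new_pos:].replace(b"\r\n", b"\n")[:8]

print(consumed, new_pos, reread)
```

Under this model the read consumes one byte more than the length of the string it returns, which is exactly the offset by which the second read is shifted.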

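Is there a way to determine that we consumed a 2-byte newline or, better still, to disable this behavior?

The best I can come up with so far is to call tell before and after the read and check how much was actually consumed, but that seems incredibly inelegant.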

As an aside, it seems to me that this behavior actually contradicts the documentation for read:

In [54]: f.read?
Type:       builtin_function_or_method
String Form:<built-in method read of file object at 0x1a35f60>
Docstring:
read([size]) -> read at most size bytes, returned as a string.

If the size argument is negative or omitted, read until EOF is reached.
Notice that when in non-blocking mode, less data than what was requested
may be returned, even if no size parameter was given.

As I read it, this says that at most size bytes should be read from the file, not that up to size bytes should be returned.

In particular, I believe the correct semantics for the example above would be:

In [11]: f.read(8)
Out[11]: 'Line1\nL' # return a string of length *7*

Am I misunderstanding the documentation?

Best answer

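What exactly are you trying to do?

If the reason you are reading forward and then seeking backward is that you want to return to a specific point in the file, use tell() to record where you are. That is easier than keeping track of how many bytes you have read.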

savepos = f.tell()
f.read(8)
f.seek(savepos)
f.read(8)
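
The same idea can be wrapped in a small helper. This is a minimal sketch, assuming Python 3, where text mode applies universal-newline translation by default (newline=None) and the 'U' mode flag is no longer available. The helper name `read_and_rewind` and the temporary file are hypothetical and only serve the demonstration.

```python
import os
import tempfile


def read_and_rewind(f, size):
    """Read up to `size` characters, then restore the previous position."""
    savepos = f.tell()   # remember where we are before reading
    data = f.read(size)
    f.seek(savepos)      # jump back using the saved position, not a byte count
    return data


# Build a small file with "\r\n" line endings, written in binary mode
# so that the carriage returns are stored exactly as given.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"Line1\r\nLine2\r\n")

# newline=None asks for universal-newline translation on read.
with open(path, "r", encoding="ascii", newline=None) as f:
    f.read(3)                      # move away from the start of the file
    first = read_and_rewind(f, 8)  # read across the translated newline
    second = f.read(8)             # the same 8 characters again

os.remove(path)
print(repr(first), repr(second))
```

Because the position is taken from tell() and handed back to seek() unchanged, the code never needs to know how many bytes on disk a translated newline occupied.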

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24466216/
