python - Manipulating a (very) long data file with Python

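I am trying to write code to manipulate a very long document (more than a million lines). The text file has timestamps at fixed intervals (every 1003 lines). Between them is the data I need, which is 1000 lines long, plus a header and two blank lines that I don't need.

I want my code to take an input between 1 and 1000 from the user (referring to a timestamp) and copy the corresponding block of lines to a separate txt file.

The code I wrote works as expected when the input is "0", but it produces no output for any other number.

Here is my code: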

import sys

time = input()
output = open('rho_output_t' + str(time), 'w',)
sys.stdout = output

filepath = 'rho.xg'
l = 2       #lower limit of 0th interval
u = 1001    #upper limit of 0th interval
step = 1003

with open(filepath) as fp:
    for t in range(0, 1000):
        print("{} ".format(t))  #this is only here so I can see the for loop running correctly
        for cnt, line in enumerate(fp):
            if int(time) == t and cnt >= l+(step*int(time)) and cnt <= u+(step*int(time)):
                print("Line {}: {}".format(cnt, line))

output.close()

Where did I go wrong, and how can I fix it? Thanks in advance for your help!

Accepted answer

Try:

with open(filepath) as fp:
    for t in range(0, 1000):
        print("{} ".format(t))  #this is only here so I can see the for loop running correctly
        if int(time) == t:
            # Only read the file once t matches the requested timestamp
            for cnt, line in enumerate(fp):
                if cnt >= l+(step*int(time)) and cnt <= u+(step*int(time)):
                    print("Line {}: {}".format(cnt, line))

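This ensures that you only iterate over fp when t matches the input time. A file object is an iterator that can be consumed only once, so in the original code the inner loop read the whole file during t == 0 and left nothing for any later value of t. Checking the time first prevents the file from being exhausted at t == 0.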
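
As an alternative to looping over every timestamp, the block can also be sliced out of the file directly. The following is only a minimal sketch, not the method from the answer above. The function names `extract_block` and `write_block` are hypothetical, and it assumes the layout described in the question: block 0 occupies the 0-based lines 2 through 1001, and the pattern repeats every 1003 lines.

```python
from itertools import islice


def extract_block(lines, t, first=2, last=1001, step=1003):
    """Return the data lines for timestamp index t from an iterable of lines.

    first and last are the 0-based, inclusive limits of block 0
    (the question's l and u); step is the distance between blocks.
    """
    start = first + step * t
    stop = last + step * t + 1  # islice excludes the stop index
    return list(islice(lines, start, stop))


def write_block(in_path, out_path, t):
    """Hypothetical helper: copy block t from in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(extract_block(src, t))
```

With the question's file names, a call might look like `write_block('rho.xg', 'rho_output_t5', 5)`. `islice` still has to read past the earlier lines, but it stops as soon as the block ends and never holds more than the 1000 requested lines in memory.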

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/58822833/
