python - How do I clean up a text file in Python?

The text in my file looks like this:

text1 5,000 6,000
text2 2,000 3,000
text3 
           5,000 3,000
text4 1,000 2000
text5
          7,000 1,000
text6 2,000 1,000

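Is there a way to clean this up in Python so that, when the numbers are missing after a text line, the numbers from the following line are moved up onto that line: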

text1 5,000 6,000
text2 2,000 3,000
text3 5,000 3,000
text4 1,000 2000
text5 7,000 1,000
text6 2,000 1,000

Thanks!

Best answer

Assuming each line should have exactly three "words", you can use

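# Flatten the file into one stream of whitespace-separated tokens.
with open("file") as f:
    tokens = (x for line in f for x in line.split())
    # Passing the same iterator three times yields consecutive triples.
    for t in zip(tokens, tokens, tokens):
        print(" ".join(t))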
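To see why the snippet above depends on every logical row having exactly three tokens, here is a small in-memory sketch of the same idiom, written for Python 3. The function name `regroup_in_threes` and the sample string are made up for illustration; they are not part of the original answer.

```python
def regroup_in_threes(text):
    """Rejoin whitespace-separated tokens into rows of exactly three."""
    tokens = iter(text.split())
    # The same iterator is passed three times, so each tuple consumes
    # three consecutive tokens; zip drops a trailing partial group.
    return [" ".join(triple) for triple in zip(tokens, tokens, tokens)]


# Hypothetical sample: one wrapped row followed by one intact row.
sample = "text3 \n           5,000 3,000\ntext4 1,000 2000\n"
rows = regroup_in_threes(sample)
print("\n".join(rows))
```

Because the grouping is purely positional, a single row with a missing or extra token shifts every row after it, which is the weakness the edit below addresses.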

Edit: Since the precondition above evidently does not hold, here is an implementation that actually looks at the data:

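from itertools import groupby

with open("file") as f:
    tokens = (x for line in f for x in line.split())
    # Group consecutive tokens by whether they start with a digit.
    for key, it in groupby(tokens, lambda x: x[0].isdigit()):
        if key:
            print(" ".join(it))            # a run of numbers ends the row
        else:
            print("\n".join(it), end=" ")  # a label starts the row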
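The same grouping idea can also be wrapped in a function that returns the cleaned rows instead of printing them, which makes it easier to check against the expected output from the question. This is a minimal Python 3 sketch, not the answerer's code: the name `merge_wrapped_rows` is hypothetical, and it assumes that labels never start with a digit and that numbers always do.

```python
from itertools import groupby


def merge_wrapped_rows(lines):
    """Move numbers that wrapped onto the next line back up to their label."""
    tokens = (tok for line in lines for tok in line.split())
    rows = []
    # Assumption: a token is a number exactly when its first character is a digit.
    for is_number, group in groupby(tokens, key=lambda tok: tok[0].isdigit()):
        group = list(group)
        if not is_number:
            rows.extend(group)                 # each label starts a new row
        elif rows:
            rows[-1] += " " + " ".join(group)  # numbers attach to the latest label
        else:
            rows.append(" ".join(group))       # numbers that precede any label
    return rows


# The sample data from the question, including the two wrapped rows.
sample = [
    "text1 5,000 6,000",
    "text2 2,000 3,000",
    "text3 ",
    "           5,000 3,000",
    "text4 1,000 2000",
    "text5",
    "          7,000 1,000",
    "text6 2,000 1,000",
]
cleaned = merge_wrapped_rows(sample)
print("\n".join(cleaned))
```

Any iterable of lines works as input, so an open file object can be passed in place of the `sample` list.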

For more on "python - How do I clean up a text file in Python?", see the similar question found on Stack Overflow: https://stackoverflow.com/questions/5794498/
