I have a file containing the following data:
------------------------------
------------------------------
<TIME:2020-01-01 01:25:10>
<TIME:2020-01-01 01:25:10>
<TIME:2020-01-01 01:25:10>
<TIME:2020-01-01 01:25:10>
------
++++++
%%RequestHandler
DATA1 = 123456
ERROR1 = 500
DATA2 = 56789
ERROR2 = 505
Count = 4
---
I want to parse this into a pandas DataFrame.
Best answer
Another regular-expression approach, using pivot:
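import re
import pandas as pd

# `text` holds the file contents, e.g. text = file.read()
# Groups: (0) field name, (1) numeric suffix, (2) value.
# \s* tolerates optional indentation; "Count = 4" has no suffix and is skipped.
out = (pd.DataFrame(re.findall(r'^\s*([A-Za-z]+)(\d+) = (\d+)', text, flags=re.M))
       .pivot(index=1, columns=0, values=2)
       .rename_axis(index=None, columns=None)
      )
print(out)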
Output:
     DATA ERROR
1  123456   500
2   56789   505
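To make the two steps easier to follow, here is a minimal sketch that shows the intermediate result of re.findall before it is pivoted. The names `sample`, `matches`, `long_df`, `wide` and the column labels `name`, `idx`, `value` are illustrative choices, not part of the original answer, and the sample is a shortened copy of the input.

```python
import re
import pandas as pd

# Shortened copy of the input; only the key/value lines matter here.
sample = """%%RequestHandler
DATA1 = 123456
ERROR1 = 500
DATA2 = 56789
ERROR2 = 505
Count = 4"""

# Step 1: one (name, suffix, value) tuple per matching line.
# "Count = 4" has no digits before " = ", so it does not match.
matches = re.findall(r'^\s*([A-Za-z]+)(\d+) = (\d+)', sample, flags=re.M)
print(matches)

# Step 2: a long-format frame with one row per match.
long_df = pd.DataFrame(matches, columns=['name', 'idx', 'value'])

# Step 3: pivot so each suffix becomes a row and each name a column.
wide = long_df.pivot(index='idx', columns='name', values='value')
print(wide)
```

Naming the columns instead of relying on the default 0, 1, 2 labels is only a readability choice; the pivot itself is the same.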
Input used:
text = '''------------------------------
------------------------------
<TIME:2020-01-01 01:25:10>
<TIME:2020-01-01 01:25:10>
<TIME:2020-01-01 01:25:10>
<TIME:2020-01-01 01:25:10>
------
++++++
%%RequestHandler
DATA1 = 123456
ERROR1 = 500
DATA2 = 56789
ERROR2 = 505
Count = 4'''
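Note that re.findall returns strings, so the pivoted frame above holds text rather than numbers. If numeric values are needed, one possible variation (a sketch, not part of the original answer) converts the suffix and value to integers before pivoting. The helper name `parse_block` and the column labels are hypothetical.

```python
import re
import pandas as pd

def parse_block(text: str) -> pd.DataFrame:
    """Parse lines like 'DATA1 = 123456' into a wide frame with integer values."""
    rows = re.findall(r'^\s*([A-Za-z]+)(\d+) = (\d+)', text, flags=re.M)
    long_df = (pd.DataFrame(rows, columns=['name', 'idx', 'value'])
               .astype({'idx': int, 'value': int}))  # strings -> integers
    return (long_df.pivot(index='idx', columns='name', values='value')
                   .rename_axis(index=None, columns=None))

example = """DATA1 = 123456
ERROR1 = 500
DATA2 = 56789
ERROR2 = 505
Count = 4"""

result = parse_block(example)
print(result)
```

With integer values, column arithmetic such as `result['ERROR'].sum()` works directly, which it would not on the string version.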
For more on parsing text in Python and loading the result into a DataFrame, see the similar question on Stack Overflow: https://stackoverflow.com/questions/76670562/