Python: text file to rows and columns

Tags: python rows spreadsheet transpose

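I have been trying this for a while, seem to have hit a snag, and need help.

I have a few text files. I won't write them out in full; here is an example: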

2020
Grum Grum
Stamina: 20
Agility: 23
Strength: 20.5%
Resistances: 20-21-30

2020
Mondo Silo
Stamina: 23
Agility: 13
Strength: 10.5%
Resistances: 20-21-20

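And so on. Some files are like this, where a new stat sheet starts every 6 lines; in other files a new stat sheet starts every 10 lines.

My goal is that every time a stat sheet ends, it is placed into a row with one column per line. I think this is called transposing in spreadsheet terms, but I know I'm doing something wrong, and I'm not even sure that is the right word.

For example, I would want the file to look like this when it's done: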

Year | Name | Stamina | Agility | Str | Res
2020 | Grum Grum | Stamina: 20 | Agility: 23 | Strength: 20.5% | Resistances: 20-21-30

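I have tried NumPy and pandas and don't know what I'm doing wrong. Honestly, I don't know what to search for to find the right answer.

Any help would be greatly appreciated. These files are very large, and I want to be able to specify the number of columns each stat sheet should fill.

Thank you in advance if you can help.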

Best answer

You can try this to get the desired dataframe:

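import pandas as pd

# Records are separated by a blank line; each record has one field per line
with open(r'test1.txt', 'r') as file:
    data = file.read().strip().split('\n\n')
data = [i.split('\n') for i in data if i != '']
df = pd.DataFrame(data, columns=['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res'])

print(df)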

Output:

   Year        Name  ...              Str                    Res
0  2020   Grum Grum  ...  Strength: 20.5%  Resistances: 20-21-30
1  2020  Mondo Silo  ...  Strength: 10.5%  Resistances: 20-21-20
2  2020   Grum Grum  ...  Strength: 20.5%  Resistances: 20-21-30
3  2020  Mondo Silo  ...  Strength: 10.5%  Resistances: 20-21-20
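
The cells above still carry their labels (for example `Stamina: 20`). If only the values are wanted, one possible follow-up step is sketched below. It builds a one-row frame shaped like the one above so that it runs on its own; the helper name `strip_labels` is hypothetical and not part of pandas, and it assumes every label is followed by `': '`.

```python
import pandas as pd

def strip_labels(df, columns):
    """Return a copy of df where 'Label: value' cells in `columns` keep only the value."""
    out = df.copy()
    for col in columns:
        # Split on the first ': ' only and keep the last piece;
        # cells without a label are left unchanged.
        out[col] = out[col].str.split(': ', n=1).str[-1]
    return out

df = pd.DataFrame(
    [['2020', 'Grum Grum', 'Stamina: 20', 'Agility: 23',
      'Strength: 20.5%', 'Resistances: 20-21-30']],
    columns=['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res'],
)
clean = strip_labels(df, ['Stamina', 'Agility', 'Str', 'Res'])
print(clean)
```

The values are still strings after this step; converting them to numbers (for example removing `%` or splitting the resistances) would be a separate choice.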

To build one dataframe from a list of .txt files that share this structure but hold different numbers of records, you can try:

Option 1

import pandas as pd

files = ['test1.txt', 'test2.txt']                    # list of files
columns = ['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res']

frames = []
for file in files:                                    # open each file in turn
    # 'path_of_files' is a placeholder prefix; it must end with a path separator
    with open(r'path_of_files' + file, 'r') as file_r:
        data = file_r.read().strip().split('\n\n')    # one block per record
    data = [i.split('\n') for i in data if i != '']   # one field per line
    frames.append(pd.DataFrame(data, columns=columns))

df = pd.concat(frames, ignore_index=True)             # stack the rows from all files

print(df)
df.to_csv(r'test3.txt', sep='|', index=False)         # write the result to 'test3.txt' with '|' as separator

Option 2

import pandas as pd

files = ['test1.txt', 'test2.txt']                    # list of files
columns = ['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res']

# open the output once in 'w' mode so that re-running does not append duplicates
with open(r'test3.txt', 'w') as fout:
    for n, file in enumerate(files):
        # 'path_of_files' is a placeholder prefix; it must end with a path separator
        with open(r'path_of_files' + file, 'r') as file_r:
            data = file_r.read().strip().split('\n\n')
        data = [i.split('\n') for i in data if i != '']
        df = pd.DataFrame(data, columns=columns)      # dataframe with the current file's data
        # write the column names only for the first file; never write the index
        fout.write(df.to_string(header=(n == 0), index=False))
        fout.write('\n')                              # newline so the next file's rows start on a new line

Example
With the dummy files below (test1.txt, test2.txt), you can see the result (test3.txt) produced by each of the two options:

test1.txt

2020
Grum Grum
Stamina: 20
Agility: 23
Strength: 20.5%
Resistances: 20-21-30

2020
Mondo Silo
Stamina: 23
Agility: 13
Strength: 10.5%
Resistances: 20-21-20

test2.txt

2020
Grum Grum
Stamina: 20
Agility: 23
Strength: 20.5%
Resistances: 20-21-30

2020
Mondo Silo
Stamina: 23
Agility: 13
Strength: 10.5%
Resistances: 20-21-20

2020
Mondo Silo
Stamina: 23
Agility: 13
Strength: 10.5%
Resistances: 20-21-20

2020
Mondo Silo
Stamina: 23
Agility: 13
Strength: 10.5%
Resistances: 20-21-20

test3.txt (output file) using option 1

Year|Name|Stamina|Agility|Str|Res
2020|Grum Grum|Stamina: 20|Agility: 23|Strength: 20.5%|Resistances: 20-21-30
2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
2020|Grum Grum|Stamina: 20|Agility: 23|Strength: 20.5%|Resistances: 20-21-30
2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20
2020|Mondo Silo|Stamina: 23|Agility: 13|Strength: 10.5%|Resistances: 20-21-20

test3.txt (output file) using option 2

 Year        Name      Stamina      Agility              Str                    Res
 2020   Grum Grum  Stamina: 20  Agility: 23  Strength: 20.5%  Resistances: 20-21-30
 2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
 2020   Grum Grum  Stamina: 20  Agility: 23  Strength: 20.5%  Resistances: 20-21-30
 2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
 2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
 2020  Mondo Silo  Stamina: 23  Agility: 13  Strength: 10.5%  Resistances: 20-21-20
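
The question also asks to set the number of lines per stat sheet (6 in some files, 10 in others) and mentions that the files are very large. Both options above read each file whole and rely on blank lines between records. A minimal sketch of an alternative, using only the standard library, groups every `n_fields` non-blank lines into one row and writes rows as it goes. The names `iter_records` and `write_table` are hypothetical, and the sketch assumes blank lines carry no meaning and every record has exactly `n_fields` non-blank lines. An in-memory `io.StringIO` stands in for the real files.

```python
import csv
import io
from itertools import islice

def iter_records(lines, n_fields):
    """Yield lists of n_fields stripped, non-blank lines, reading lazily."""
    non_blank = (line.strip() for line in lines if line.strip())
    while True:
        record = list(islice(non_blank, n_fields))
        if not record:
            return
        if len(record) != n_fields:
            raise ValueError(f'incomplete record at end of input: {record}')
        yield record

def write_table(src, dst, columns):
    """Stream records from src to dst as '|'-separated rows, one record at a time."""
    writer = csv.writer(dst, delimiter='|', lineterminator='\n')
    writer.writerow(columns)
    for record in iter_records(src, len(columns)):
        writer.writerow(record)

# Stand-in for an input file with two 6-line records
src = io.StringIO(
    '2020\nGrum Grum\nStamina: 20\nAgility: 23\nStrength: 20.5%\nResistances: 20-21-30\n'
    '\n'
    '2020\nMondo Silo\nStamina: 23\nAgility: 13\nStrength: 10.5%\nResistances: 20-21-20\n'
)
dst = io.StringIO()
write_table(src, dst, ['Year', 'Name', 'Stamina', 'Agility', 'Str', 'Res'])
print(dst.getvalue())
```

With real files, open file objects can be passed in place of `src` and `dst`; for a 10-line stat sheet, pass a list of 10 column names. The resulting file can then be loaded with `pd.read_csv('test3.txt', sep='|')` if a dataframe is needed.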

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/62617866/
