python - Create a new txt file for each file separately, containing the sizes of the output and input files

Tags: python pandas dataframe

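The first part of the code below works fine. In the second part I am trying to create a new txt file for each file, containing information about the files created in the first part. For example, the txt file for the first file should contain: INPUT FILE1 SIZE IS 42, OUTPUT FILE1 SIZE IS 324, then for the second file: INPUT FILE2 SIZE IS 62, OUTPUT FILE2 SIZE IS 543, and so on.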

import pandas as pd
import glob
import os

files = glob.glob('*.csv')
for file in files:
    df = pd.read_csv(file, header= None)
    df1 = df.iloc[:, :4].agg(['sum','max','std'])
    df1.columns = range(1, len(df1.columns) + 1)
    s = df1.stack()
    L = ['{} of the {}. column is {}'.format(a, b, c) for (a, b), c in s.items()]
    output_file_name = "output_" + file
    pd.Series(L).to_csv(output_file_name ,index=False) 

#this part is good


for file in files:
    with open(file + "stats.txt", 'a+') as f:
        f.write(' input file size is {}'.format(os.path.getsize(file)))
        f.write('output file size is {}'.format(os.path.getsize(output_file_name)))
    f.close()

Best answer

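Rebuild output_file_name inside the loop (in the question's second loop it still holds the name from the last iteration of the first loop), use os.path.splitext to strip the extension from the original file name, and drop f.close(), because with closes the file automatically: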

import glob, os
import pandas as pd

files = glob.glob('*.csv')

#loop over all CSV files
for file in files:
    #skip generated files, keep only the original inputs
    if not file.startswith(('output_','file_size_')):
        #mode 'w' overwrites an existing stats file
        with open(os.path.splitext(file)[0] + "stats.txt", 'w') as f:
            output_file_name = "output_" + file
            #build both messages
            infile = 'SIZE OF INPUT FILE {} IS {}, '.format(file, os.path.getsize(file))
            outfile = 'SIZE OF OUTPUT FILE {} IS {}'.format(output_file_name,
                                                            os.path.getsize(output_file_name))

            f.write(infile)
            f.write(outfile)
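
As a small illustration of the two points above, here is a sketch of how os.path.splitext splits a file name and how a with block leaves the file closed. The helper stats_name and the sample file name file1.csv are hypothetical and only used for the demonstration; the naming rule is the one from the answer.

```python
import os
import tempfile

# splitext returns (root, extension); the extension keeps its leading dot.
root, ext = os.path.splitext("file1.csv")
print(root, ext)

def stats_name(csv_name):
    # Hypothetical helper: same naming rule as in the answer above.
    return os.path.splitext(csv_name)[0] + "stats.txt"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, stats_name("file1.csv"))
    with open(path, "w") as f:
        f.write("SIZE OF INPUT FILE file1.csv IS 42")
    # The with block has already closed the file, so no f.close() is needed.
    print(f.closed)
```

Because the extension is removed before "stats.txt" is appended, file1.csv gives file1stats.txt rather than file1.csvstats.txt.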

Edit:

To also report totals, accumulate the input and output file sizes in variables:

import glob, os
import pandas as pd

files = glob.glob('*.csv')

input_all, output_all = 0, 0
#loop over all CSV files
for file in files:
    if not (file.startswith('output_') or file.endswith('stats.txt')):
        with open(os.path.splitext(file)[0] + "stats.txt", 'w') as f:
            output_file_name = "output_" + file
            #get both sizes and add them to the running totals
            i = os.path.getsize(file)
            o = os.path.getsize(output_file_name)
            input_all += i
            output_all += o
            infile = 'SIZE OF INPUT FILE {} IS {}, '.format(file, i)
            outfile = 'SIZE OF OUTPUT FILE {} IS {}'.format(output_file_name, o)

            f.write(infile)
            f.write(outfile)


with open("final_stats.txt", 'w') as f:
    instring = 'SIZE OF ALL INPUT FILES IS {}, '.format(input_all)
    outstring = 'SIZE OF ALL OUTPUT FILES IS {}, '.format(output_all)
    both = 'SIZE OF ALL FILES IS {}'.format(input_all + output_all)
    f.write(instring)
    f.write(outstring)
    f.write(both)
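
The same logic can also be wrapped in a function so it is easy to try on a scratch folder. This is only a sketch under a few assumptions: the function name write_size_reports and its folder parameter are hypothetical, and the check that skips inputs whose output file does not exist yet is an addition not present in the answer above.

```python
import glob
import os
import tempfile

def write_size_reports(folder):
    """Write one stats file per input CSV plus final_stats.txt; return the totals."""
    input_all, output_all = 0, 0
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        name = os.path.basename(path)
        out_path = os.path.join(folder, "output_" + name)
        # Skip generated output files, and (an added assumption) inputs
        # whose output file has not been created yet.
        if name.startswith("output_") or not os.path.exists(out_path):
            continue
        i, o = os.path.getsize(path), os.path.getsize(out_path)
        input_all += i
        output_all += o
        stats_path = os.path.join(folder, os.path.splitext(name)[0] + "stats.txt")
        with open(stats_path, "w") as f:
            f.write("SIZE OF INPUT FILE {} IS {}, ".format(name, i))
            f.write("SIZE OF OUTPUT FILE {} IS {}".format("output_" + name, o))
    with open(os.path.join(folder, "final_stats.txt"), "w") as f:
        f.write("SIZE OF ALL INPUT FILES IS {}, ".format(input_all))
        f.write("SIZE OF ALL OUTPUT FILES IS {}, ".format(output_all))
        f.write("SIZE OF ALL FILES IS {}".format(input_all + output_all))
    return input_all, output_all

# Small demonstration on throwaway files (4 bytes of input, 13 bytes of output).
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("a.csv", "1,2\n"), ("output_a.csv", "sum of col 1\n")]:
        with open(os.path.join(tmp, name), "w", newline="") as f:
            f.write(text)
    totals = write_size_reports(tmp)
    with open(os.path.join(tmp, "astats.txt")) as f:
        report = f.read()

print(totals)
print(report)
```

Returning the totals keeps the function easy to check without reading final_stats.txt back.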

This question, "python - Create a new txt file for each file separately, containing the sizes of the output and input files", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/54087711/
