python - Removing columns from a CSV file using an if statement

Tags: python csv parsing delete-row

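(I'm new to Python and to coding.) I'm using a program that outputs a text file with thousands of rows of data that I need to plot. I need to parse the first three columns and discard the rest. When a simulation run finishes, the program sometimes gives me three columns and sometimes up to 10. I've been trying to figure out how to delete the columns I don't need. I tried a for loop and also thought about a while loop, but I don't know how to make it work.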

This is my while loop:

fin = "file location"
fout = "file location"

#delete unused columns
while cols_to_remove in fin >3:
    cols_to_remove = [3, 4, 5] # Column indexes to be removed (starts at 0)
    cols_to_remove = sorted(cols_to_remove, reverse=True) # Reverse so we remove from the end first
    row_count = 0 # Current amount of rows processed
    with open(fin, "r") as source:
        reader = csv.reader(source)
        with open(fout, "w", newline='') as result:
            writer = csv.writer(result)
            for row in reader:
                row_count += 1
                for col_index in cols_to_remove:
                    del row[col_index]
                writer.writerow(row)

And this is the if loop I tried:

for cols_to_remove in fin == 10:
    cols_to_remove = [3, 4, 5, 6, 7, 8, 9] # Column indexes to be removed (starts at 0)
    cols_to_remove = sorted(cols_to_remove, reverse=True) # Reverse so we remove from the end first
    row_count = 0 # Current amount of rows processed
    reader = csv.reader(source)
    with open(fout, "w", newline='') as result:
        writer = csv.writer(result)
        for row in reader:
            row_count += 1
            for col_index in cols_to_remove:
                del row[col_index]
            writer.writerow(row)
elif cols_to_remove in fin == 9:
    cols_to_remove = [3, 4, 5, 6, 7, 8] # Column indexes to be removed (starts at 0)
    cols_to_remove = sorted(cols_to_remove, reverse=True) # Reverse so we remove from the end first
    row_count = 0 # Current amount of rows processed
    reader = csv.reader(source)
    with open(fout, "w", newline='') as result:
        writer = csv.writer(result)
        for row in reader:
            row_count += 1
            for col_index in cols_to_remove:
                del row[col_index]
            writer.writerow(row)
else:
    break

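I'm not sure whether I'm on the right track. If I use only the part of the code starting at cols_to_remove, and change the list to [3, 4] when there are five columns, it works fine.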

This is what I start with in the text file (the first five rows of the data set):

X [m],Y [m],Z [m],Task #,Pulse #,Pixel X,Pixel Y,Pixel #,Return,Intensity
4.630,-5.078,16.517,0,0,0,30,960,0,0.211
4.937,-4.779,13.969,0,0,2,32,1026,0,0.106
4.630,-4.623,16.366,0,0,0,33,1056,0,0.205
4.937,-4.626,14.418,0,0,2,33,1058,0,0.296
5.087,-4.626,14.868,0,0,3,33,1059,0,0.109

This is the result I want (in a CSV file):

X [m],Y [m],Z [m]
4.630,-5.078,16.517
4.937,-4.779,13.969
4.630,-4.623,16.366
4.937,-4.626,14.418
5.087,-4.626,14.868

Best answer

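I think you are approaching this backwards: why take columns from the CSV if you are going to delete them afterwards? Instead, take only the columns you need, like this: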

import csv

new_list = []
with open('example.csv', newline='') as f:
    # Create reader object
    reader_obj = csv.reader(f)

    #Add next(reader_obj) here

    # Iterate over each row in the csv
    # file using reader object
    for row in reader_obj:
        new_list.append([row[0], row[1], row[2]]) #get the first 3 elements

print(new_list)

A simpler way

import csv

with open('example.csv', newline='') as f:
    new_list = [[row[0], row[1], row[2]] for row in csv.reader(f)]

print(new_list)

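Note that both approaches include the header row. To drop it, add a next(reader_obj) call at the spot marked with a comment in the first snippet. Alternatively, you can simply pop the first element of the new list afterwards: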

new_list.pop(0) #Pop (remove) the first element of the list
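
The snippets above build a list and print it, while the question asks for a CSV file as the result. A minimal sketch of that last step might look like the following. The function name keep_first_columns and its parameters are made up for illustration; only csv.reader and csv.writer come from the standard library. It keeps the header row, as in the desired output shown in the question.

```python
import csv


def keep_first_columns(src_path, dst_path, n_cols=3):
    """Copy src_path to dst_path, keeping only the first n_cols columns.

    Hypothetical helper: the name and parameters are assumptions,
    not part of the original answer.
    """
    with open(src_path, newline="") as source, \
            open(dst_path, "w", newline="") as result:
        writer = csv.writer(result)
        for row in csv.reader(source):
            # Slicing works for a row with any number of columns,
            # so no if/while test on the column count is needed.
            writer.writerow(row[:n_cols])
```

Calling keep_first_columns("example.csv", "trimmed.csv") (both file names are placeholders) would then write the trimmed file. Unlike row[0], row[1], row[2], the slice row[:n_cols] does not raise an IndexError on a row with fewer than three fields; it simply writes whatever is there.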

Edit: You can also use pandas, which is much simpler:

import pandas as pd

df = pd.read_csv('example.csv', usecols=[0, 1, 2], header=0)
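
To get from the DataFrame back to a CSV, something along these lines should work. This is only a sketch: it reads two sample rows from the question through io.StringIO so that it runs without a file on disk, and the variable names sample and csv_text are illustrative.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real run would pass a file path.
sample = io.StringIO(
    "X [m],Y [m],Z [m],Task #,Pulse #,Pixel X,Pixel Y,Pixel #,Return,Intensity\n"
    "4.630,-5.078,16.517,0,0,0,30,960,0,0.211\n"
    "4.937,-4.779,13.969,0,0,2,32,1026,0,0.106\n"
)

# usecols=[0, 1, 2] keeps the first three columns, however many the file has.
df = pd.read_csv(sample, usecols=[0, 1, 2])

# index=False stops pandas from writing its row index as an extra column.
csv_text = df.to_csv(index=False)
```

With a real file, df.to_csv('trimmed.csv', index=False) (file name assumed) writes the result to disk instead of returning a string. Because pandas parses the values as floats, trailing zeros are not preserved (4.630 is written back as 4.63); passing dtype=str to read_csv keeps the text exactly as it was, at the cost of having to convert before plotting.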

Regarding "python - Removing columns from a CSV file using an if statement", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/75687581/
