python - How to update csv values given that their row and column numbers are known to Python

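I have some code that goes through the rows of a .csv file and checks whether each row contains any of the values from a specific column of another .csv file.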

The files look like this:

Lookup file:

0:  TextExclude, text, other
1:  aa,        , x ,   y
2:  bb,        , x ,   y
3:  cc,        , x ,   y

The file in which I want to look for these values:

0: x, longtext, exclude
1: x, helloaa,  0
2: x, testaa,   0
3: x, testcc,   0
4: x, no,       0
5: x, aabb,     0

The output of my code should change the value in the "exclude" column from 0 to 1 in every row except row 4, producing this expected output csv table:

0: x, longtext, exclude
1: x, helloaa,  1
2: x, testaa,   1
3: x, testcc,   1
4: x, no,       0
5: x, aabb,     1

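Since my code can already print the row numbers where a match was found, and the column number is already defined, what is the best way to go about this and update the .csv file accordingly?

Here is my code: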

import pandas
findlist = []
linecount=0
       
with open('lookup.csv', 'r') as f:
    column_names = ["TextExclude", "Exclusion", "Filename"]
    r = pandas.read_csv(f, names=column_names)
    findlist = r.TextExclude.to_list()

with open('datafile.csv', 'r') as f:
    # Skip the first line
    f.readline()
    for line in f: 
        linecount = linecount +1
        if any(listelement in line for listelement in findlist):
            print(line)

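Best answer

You should parse datafile.csv with pandas rather than just reading it as a flat text file. That way you can restrict the search to the correct column and update the third column more easily.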

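import pandas

# Collect the lookup strings from the first column of lookup.csv
with open('lookup.csv', 'r') as f:
    column_names = ["TextExclude", "Exclusion", "Filename"]
    r = pandas.read_csv(f, names=column_names)
    findlist = r.TextExclude.to_list()

with open('datafile.csv', 'r') as f:
    df = pandas.read_csv(f)

for ri, row in df.iterrows():
    # Search only the second column (the long text)
    if any(x in row.iloc[1] for x in findlist):
        # iterrows() may yield copies, so write through the DataFrame itself.
        # ri doubles as the row position because read_csv builds a default index.
        df.iat[ri, 2] = 1

# print result to stdout
print(df)

# or write result to a file (index=False omits the extra index column):
df.to_csv("output.csv", index=False)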
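
As an alternative to the row loop above, the same update can probably be expressed without iterating at all. The following is only a sketch under some assumptions: the two in-memory strings are hypothetical stand-ins for lookup.csv and datafile.csv (without the padding spaces shown in the question), the lookup file is assumed to have a header row, and the lookup values are treated as literal substrings rather than patterns.

```python
import io
import re

import pandas

# Hypothetical in-memory stand-ins for lookup.csv and datafile.csv
lookup_csv = io.StringIO("TextExclude,Exclusion,Filename\naa,x,y\nbb,x,y\ncc,x,y\n")
data_csv = io.StringIO(
    "x,longtext,exclude\n"
    "x,helloaa,0\n"
    "x,testaa,0\n"
    "x,testcc,0\n"
    "x,no,0\n"
    "x,aabb,0\n"
)

findlist = pandas.read_csv(lookup_csv)["TextExclude"].dropna().astype(str).to_list()
df = pandas.read_csv(data_csv)

# One regex that matches any lookup value; re.escape keeps each value literal
pattern = "|".join(re.escape(value) for value in findlist)

# Boolean mask: True where the second column contains any lookup value
mask = df.iloc[:, 1].astype(str).str.contains(pattern, regex=True)

# Set the third column to 1 on the matching rows only
df.loc[mask, df.columns[2]] = 1

print(df.to_csv(index=False))
```

Note that an empty findlist would produce an empty pattern, which matches every row, so it may be worth checking for that case before applying the mask.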

If you want to know which versions of Pandas and Python you are using, you can run these two commands:

Python version:
python --version

Pandas version:
python -c "import pandas;print(pandas.__version__)"

Here is a second way to do the search. It breaks the steps down a bit more, which may help with debugging:

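import pandas

# Collect the lookup strings from the first column of lookup.csv
with open('lookup.csv', 'r') as f:
    column_names = ["TextExclude", "Exclusion", "Filename"]
    r = pandas.read_csv(f, names=column_names)
    findlist = r.TextExclude.to_list()

def search(search_in):
    # Return True if any lookup string occurs inside search_in
    for to_find in findlist:
        if to_find in search_in:
            return True
    return False

with open('datafile.csv', 'r') as f:
    df = pandas.read_csv(f)

for ri, row in df.iterrows():
    if search(row.iloc[1]):
        # iterrows() may yield copies, so write through the DataFrame itself.
        # ri doubles as the row position because read_csv builds a default index.
        df.iat[ri, 2] = 1

# print result to stdout
print(df)

# or write result to a file (index=False omits the extra index column):
df.to_csv("output.csv", index=False)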
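
Since the question mentions that the matching row numbers and the column number are already known, it may also be enough to overwrite those cells directly with the standard-library csv module. This is a minimal sketch, not the method used above: the sample text, the `set_cell` helper, and the `matched_rows` list are hypothetical, and the header is counted as row 0 as in the question's tables.

```python
import csv
import io

# Hypothetical sample matching the data file in the question (header is row 0)
source = io.StringIO(
    "x,longtext,exclude\n"
    "x,helloaa,0\n"
    "x,testaa,0\n"
    "x,testcc,0\n"
    "x,no,0\n"
    "x,aabb,0\n"
)

# Load the whole file as a list of rows, each row a list of strings
rows = list(csv.reader(source))


def set_cell(table, row_number, column_number, value):
    """Overwrite one cell of a list-of-lists table in place."""
    table[row_number][column_number] = value


# Row numbers reported by the search step; column 2 is the "exclude" column
matched_rows = [1, 2, 3, 5]
for row_number in matched_rows:
    set_cell(rows, row_number, 2, "1")

# Serialize the updated table back to CSV text
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
print(out.getvalue())
```

To work on real files, `source` and `out` could be replaced by files opened with `newline=""`, which is what the csv module documentation recommends.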

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/70510538/
