python - How to append text to a 'column' value

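I am trying to run several cleaning passes over a DataFrame. To keep track of what happens to the data, I added a column named applied_rules to the DataFrame.

At each step, if a record has been updated, I want to append a line to its applied_rules value.

Typically, it looks like this: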

mask = df['type'] == "test"
df.loc[mask, 'value'] = "updated"
df.loc[mask].assign(applied_rules=lambda x: x.applied_rules + "Rule 1 - ...")

However, applied_rules comes back empty for every row.

If I use:

mask = df['type'] == "test"
df.loc[mask, 'value'] = "updated"
df[mask]['applied_rules'] += "GR001a - updated position because it was not corresponding to a standard one\n"

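Only the last value is stored.

What is the correct way to append text to a value?

Best answer

Use DataFrame.loc with the mask, selecting the rows and the column in a single indexing step: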

import pandas as pd

df = pd.DataFrame({
    'type': ['text', "test", 'text', "test", "test", 'text'],
    'applied_rules': list('aaabbb')
})

# select the rows to change, then update and log them in place
mask = df['type'] == "test"
df.loc[mask, 'value'] = "updated"
df.loc[mask, 'applied_rules'] += " GR001a... "

# alternative, equivalent to the += line above
# df.loc[mask, 'applied_rules'] = df.loc[mask, 'applied_rules'] + " GR001a... "

print(df)
   type applied_rules    value
0  text             a      NaN
1  test  a GR001a...   updated
2  text             a      NaN
3  test  b GR001a...   updated
4  test  b GR001a...   updated
5  text             b      NaN
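
Why the two attempts in the question do not stick, as far as I can tell: assign() returns a new DataFrame and the result is never stored, and df[mask][...] is chained indexing, so the += lands on a temporary copy of the selected rows. Below is a minimal sketch of how several rules could be logged in sequence. The helper name apply_rule, its parameters, and the sample data are hypothetical, not part of pandas or of the original answer, and the sketch assumes applied_rules starts out as an empty string (missing values would stay missing after concatenation).

```python
import pandas as pd

df = pd.DataFrame({
    "type": ["text", "test", "text", "test"],
    "value": ["a", "b", "c", "d"],
})
# Assumption: every record starts with an empty rule log.
df["applied_rules"] = ""

# assign() builds a new DataFrame; since the result is not written
# back into df, the original applied_rules column is unchanged.
mask = df["type"] == "test"
discarded = df.loc[mask].assign(
    applied_rules=lambda x: x.applied_rules + "Rule 1 - ...\n"
)
assert (df["applied_rules"] == "").all()


def apply_rule(frame, mask, column, new_value, rule_text):
    """Hypothetical helper: set `column` and log `rule_text` on masked rows."""
    frame.loc[mask, column] = new_value
    frame.loc[mask, "applied_rules"] += rule_text + "\n"


# Each call computes its mask from the current state of df.
apply_rule(df, df["type"] == "test", "value", "updated", "Rule 1 - ...")
apply_rule(df, df["value"] == "a", "value", "A", "Rule 2 - ...")
apply_rule(df, df.index == 1, "type", "checked", "Rule 3 - ...")

print(df)
```

Rows touched by more than one rule accumulate one line per rule, and rows no rule matched keep an empty applied_rules value.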

Regarding python - How to append text to a 'column' value, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58234785/
