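Below is a portion of the data I am working with; the full data set has thousands of rows and additional columns. I need to change the values in 'Column Y' based on the following conditions on 'Column X'.
If Column X is "FIRST":
cell#1 = epithelial
cell#2 = nerve
If Column X is "SECOND":
cell#1 = endothelial
cell#2 = muscle
Dataframe: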
Column X Column Y
FIRST cell#1
FIRST A
FIRST cell#2
FIRST C
SECOND N
SECOND V
SECOND cell#1
SECOND cell#2
Code:
for row in df['Column X']:
    if row == "FIRST":
        df.loc[(df['Column Y'] == "cell#1"), 'Column Y'] = "epithelial"
        df.loc[(df['Column Y'] == "cell#2"), 'Column Y'] = "nerve"
    elif row == "SECOND":
        df.loc[(df['Column Y'] == "cell#1"), 'Column Y'] = "endothelial"
        df.loc[(df['Column Y'] == "cell#2"), 'Column Y'] = "muscle"
    else:
        pass
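My code above does not work: the replacements for row == "FIRST" are applied to the entire dataframe, and the condition row == "SECOND" is effectively ignored. Please help.
Expected result: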
Column X Column Y
FIRST epithelial
FIRST A
FIRST nerve
FIRST C
SECOND N
SECOND V
SECOND endothelial
SECOND muscle
Output of my code above (incorrect):
Column X Column Y
FIRST epithelial
FIRST A
FIRST nerve
FIRST C
SECOND N
SECOND V
SECOND epithelial
SECOND nerve
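The last two rows of Column Y should be "endothelial" and "muscle", not "epithelial" and "nerve".
Best answer
Here is one way. Note that no loop is needed: many pandas
operations are vectorized, for both ease of use and performance.
The loop in the question fails because each df.loc mask tests only 'Column Y', never 'Column X', so the first iteration (a "FIRST" row) already replaces every cell#1 and cell#2 in the whole dataframe. Combining both conditions in a single mask avoids this.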
import pandas as pd
df = pd.DataFrame([['FIRST', 'cell#1'], ['FIRST', 'A'],
                   ['FIRST', 'cell#2'], ['FIRST', 'C'],
                   ['SECOND', 'N'], ['SECOND', 'V'],
                   ['SECOND', 'cell#1'], ['SECOND', 'cell#2']],
                  columns=['X', 'Y'])
df.loc[(df.X == 'FIRST') & (df.Y == 'cell#1'), 'Y'] = 'epithelial'
df.loc[(df.X == 'FIRST') & (df.Y == 'cell#2'), 'Y'] = 'nerve'
df.loc[(df.X == 'SECOND') & (df.Y == 'cell#1'), 'Y'] = 'endothelial'
df.loc[(df.X == 'SECOND') & (df.Y == 'cell#2'), 'Y'] = 'muscle'
# X Y
# 0 FIRST epithelial
# 1 FIRST A
# 2 FIRST nerve
# 3 FIRST C
# 4 SECOND N
# 5 SECOND V
# 6 SECOND endothelial
# 7 SECOND muscle
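If the number of rules grows, the four df.loc lines above can be driven from a lookup table instead of being written out by hand. The following is only a sketch of that idea, not part of the original answer: the `replacements` dict and the `original_y` snapshot are names introduced here for illustration, and the column names 'X' and 'Y' follow the answer's example frame.

```python
import pandas as pd

df = pd.DataFrame({
    'X': ['FIRST', 'FIRST', 'FIRST', 'FIRST',
          'SECOND', 'SECOND', 'SECOND', 'SECOND'],
    'Y': ['cell#1', 'A', 'cell#2', 'C', 'N', 'V', 'cell#1', 'cell#2'],
})

# Hypothetical lookup table: (X value, old Y value) -> new Y value.
replacements = {
    ('FIRST', 'cell#1'): 'epithelial',
    ('FIRST', 'cell#2'): 'nerve',
    ('SECOND', 'cell#1'): 'endothelial',
    ('SECOND', 'cell#2'): 'muscle',
}

# Snapshot of Y so that a value written by one rule can never be
# matched again by a later rule.
original_y = df['Y'].copy()

# This loops over the rules (four here), not over the rows of the frame;
# each assignment is still a single vectorized operation.
for (x_value, old_y), new_y in replacements.items():
    mask = (df['X'] == x_value) & (original_y == old_y)
    df.loc[mask, 'Y'] = new_y

print(df)
```

Rows whose (X, Y) pair is not in the table, such as 'A' or 'N', are left unchanged, which matches the expected result in the question.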
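This question, about using an if statement in Python to replace values in 'Column Y' according to the corresponding value in 'Column X', comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/48670701/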