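I'd like to add a second condition to the code below so that the string "Other" is prepended to the text. I tried assigning it to a variable and calling that in the code, but it didn't work. The reason for doing this is that, when building reports, we can examine all the "Other" entries in the same visualization without separating them from the main job column.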
import pandas as pd
import os
os.chdir('/Users/')
df = pd.read_csv("file.csv", encoding = "ISO-8859-1")
print(df)
Job?    Other
Hitman  NaN
King    NaN
Other   Farmer
# Replace every 'Other' with the value from the subsequent "other" column.
# The "other" columns are dropped later on in the code.
df.loc[df['Job?'] == 'Other', 'Are you?'] = df['If Other: Job?']
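Would it be better to write a for statement before this, so that I can change it later and use slicing? If so, would it look something like this?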
for row in df.loc(["If Other"], axis=1):
df[row] = df[row].append("other ")
Edit for clarity:
What I want is for the Farmer result to be displayed as (or close to):
Job
Hitman
King
Other: Farmer
Further edit for jezrael:
What if I have multiple columns, like the following:
Job    Other_1  Position  Other_2  Education  Other_3
A      NaN      A         NaN      A          NaN
Other  Farmer   Other     CEO      Other      Github
#a for loop like the following:
for row in df.loc(["Other_1", "Other_2", "Other_3"], axis=1):
df[row] = df[row].append("other ")
Best answer
I believe you need to replace the values conditionally with the concatenated columns:
df.loc[df['Job?'] == 'Other', 'Job?'] = df['Job?'] + ': ' + df['Other']
Or use numpy.where:
import numpy as np
df['Job?'] = np.where(df['Job?'] == 'Other', df['Job?'] + ': ' + df['Other'], df['Job?'])
Or use mask:
df['Job?'] = df['Job?'].mask(df['Job?'] == 'Other', df['Job?'] + ': ' + df['Other'])
df = df.drop('Other', axis=1)
print (df)
Job?
0 Hitman
1 King
2 Other: Farmer
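To compare the three variants above side by side, here is a minimal self-contained sketch. It assumes the sample data from the question; the names is_other, combined, via_loc, via_where and via_mask are only illustrative and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "Job?": ["Hitman", "King", "Other"],
    "Other": [np.nan, np.nan, "Farmer"],
})

is_other = df["Job?"] == "Other"            # boolean condition
combined = df["Job?"] + ": " + df["Other"]  # NaN wherever 'Other' is NaN

# 1) Conditional assignment with .loc (on a copy, so df stays unchanged).
via_loc = df.copy()
via_loc.loc[is_other, "Job?"] = combined

# 2) numpy.where: take `combined` where the condition holds, else the old value.
via_where = pd.Series(np.where(is_other, combined, df["Job?"]), index=df.index)

# 3) Series.mask: replace the values where the condition is True.
via_mask = df["Job?"].mask(is_other, combined)

print(via_mask.tolist())  # ['Hitman', 'King', 'Other: Farmer']
```

All three should produce the same column here; which one to use is mostly a matter of taste.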
You can also add a custom string by simply leaving df['Job?'] out of the replacement value:
df['Job?'] = df['Job?'].mask(df['Job?'] == 'Other', 'ooother: ' + df['Other'])
#last remove column if necessary
df = df.drop('Other', axis=1)
print (df)
Job?
0 Hitman
1 King
2 ooother: Farmer
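One thing to watch for: concatenating a string with NaN gives NaN, so a row where the job is 'Other' but the detail column is empty would end up as NaN. A possible guard, sketched here on hypothetical data (the third row is not part of the question), is to include notna() in the condition:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row says 'Other' but has no detail value.
df = pd.DataFrame({
    "Job?": ["Hitman", "Other", "Other"],
    "Other": [np.nan, "Farmer", np.nan],
})

# Replace only where the job is 'Other' AND a detail value is present.
has_detail = (df["Job?"] == "Other") & df["Other"].notna()
df["Job?"] = df["Job?"].mask(has_detail, "Other: " + df["Other"])

print(df["Job?"].tolist())  # ['Hitman', 'Other: Farmer', 'Other']
```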
EDIT:
I think you can create a dictionary of the columns and apply the solution in a loop:
d = {'Job':'Other_1', 'Position':'Other_2', 'Education':'Other_3'}
for k,v in d.items():
df[k] = df[k].mask(df[k] == 'Other', 'other: ' + df[v])
df = df.drop(list(d.values()), axis=1)
print (df)
Job Position Education
0 A A A
1 other: Farmer other: CEO other: Github
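If there are many such column pairs, the dictionary does not have to be typed out by hand. The sketch below assumes that every main column is immediately followed by its "other" column, as in the question's layout; the name pairs is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the multi-column table in the question.
df = pd.DataFrame({
    "Job": ["A", "Other"], "Other_1": [np.nan, "Farmer"],
    "Position": ["A", "Other"], "Other_2": [np.nan, "CEO"],
    "Education": ["A", "Other"], "Other_3": [np.nan, "Github"],
})

# Assumption: columns alternate main, other, main, other, ...
cols = list(df.columns)
pairs = dict(zip(cols[::2], cols[1::2]))

for main_col, other_col in pairs.items():
    df[main_col] = df[main_col].mask(df[main_col] == "Other",
                                     "other: " + df[other_col])

# Drop the helper columns once they have been merged in.
df = df.drop(columns=list(pairs.values()))
print(df)
```

If the columns are not laid out in alternating order, writing the dictionary explicitly as in the answer is the safer option.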
Regarding python - Using append() with a df.loc == statement in Pandas Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48886849/