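First, I should say that I have already tried the solutions suggested for this kind of problem, but none of them came close enough. My data frame looks like this: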
Type  Run  Status  Message
P1    R1   OK
P1    R2   NOK     Unable to connect
P1    R3   OK
P1    R4   NOK     Unable to fetch
P2    R1   OK
P2    R2   OK
P2    R3   NOK     Entry not present
P2    R4   NOK     Entry not present
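For anyone who wants to reproduce this, the frame can be built roughly as follows. This is only a sketch: it assumes the blank Message cells are None, which the table itself does not say.

```python
import pandas as pd

# Assumption: rows with Status == "OK" carry no message, stored here as None.
df = pd.DataFrame({
    "Type": ["P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2"],
    "Run": ["R1", "R2", "R3", "R4", "R1", "R2", "R3", "R4"],
    "Status": ["OK", "NOK", "OK", "NOK", "OK", "OK", "NOK", "NOK"],
    "Message": [
        None, "Unable to connect", None, "Unable to fetch",
        None, None, "Entry not present", "Entry not present",
    ],
})
print(df)
```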
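For each "Type", I want to concatenate all unique error messages from the rows where Status == NOK into a new column, placed on the first NOK row of that Type: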
Type  Run  Status  Message            Error
P1    R1   OK
P1    R2   NOK     Unable to connect  Unable to connect, Unable to fetch
P1    R3   OK
P1    R4   NOK     Unable to fetch
P2    R1   OK
P2    R2   OK
P2    R3   NOK     Entry not present  Entry not present
P2    R4   NOK     Entry not present
Any pointers would be helpful.
Best answer
You can use a groupby operation:
import pandas as pd

# identify NOK rows
m = df['Status'].eq('NOK')
# get index of first NOK per group
idx = m[m].groupby(df['Type']).cumcount().loc[lambda x: x==0].index
# initialize Error to empty string (optional)
df['Error'] = ''
# concatenate unique error messages per group
# and assign to first NOK per group
# (positional assignment: assumes every Type has at least one NOK row
#  and that the first NOK rows appear in sorted Type order)
df.loc[idx, 'Error'] = (df.groupby('Type')['Message']
    .agg(lambda g: ', '.join(dict.fromkeys(g.replace('', pd.NA).dropna())))
    .tolist()
)
Output:
  Type Run Status            Message                               Error
0   P1  R1     OK               None
1   P1  R2    NOK  Unable to connect  Unable to connect, Unable to fetch
2   P1  R3     OK               None
3   P1  R4    NOK    Unable to fetch
4   P2  R1     OK               None
5   P2  R2     OK               None
6   P2  R3    NOK  Entry not present                   Entry not present
7   P2  R4    NOK  Entry not present
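The positional assignment above lines up only when every Type has at least one NOK row and the first NOK rows appear in sorted Type order. A possible variant that matches messages to rows by Type label instead is sketched below. The helper name `add_error_column` and the `demo` frame are made up for illustration and are not part of the original answer. `dict.fromkeys` is kept because it drops duplicates while preserving first-seen order.

```python
import pandas as pd


def add_error_column(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: put the unique NOK messages of each Type
    on that Type's first NOK row, leaving all other rows empty."""
    out = df.copy()
    nok = out["Status"].eq("NOK")

    # Unique, non-empty messages per Type, taken from NOK rows only.
    joined = (
        out.loc[nok]
        .groupby("Type")["Message"]
        .agg(lambda s: ", ".join(dict.fromkeys(m for m in s.dropna() if m != "")))
    )

    # First NOK row of each Type: non-NOK rows are masked to NaN so that
    # they cannot count as an earlier occurrence of the Type.
    first_nok = nok & ~out["Type"].where(nok).duplicated()

    out["Error"] = ""
    # Label-based lookup, so group order and missing groups do not matter.
    out.loc[first_nok, "Error"] = out.loc[first_nok, "Type"].map(joined).fillna("")
    return out


# Illustrative data: Types out of sorted order, and P3 has no NOK row.
demo = pd.DataFrame({
    "Type": ["P2", "P1", "P3", "P1", "P2"],
    "Run": ["R1", "R1", "R1", "R2", "R2"],
    "Status": ["NOK", "NOK", "OK", "NOK", "NOK"],
    "Message": ["Entry not present", "Unable to connect", None,
                "Unable to fetch", "Entry not present"],
})
result = add_error_column(demo)
print(result)
```

The function works on a copy, so the input frame is left unchanged; call it as `df = add_error_column(df)` to keep the new column.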
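For "python - Concatenating multiple values based on a condition in pandas to present a single column as a new column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73897901/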