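I need to assign the correct value (Qualified or Not Qualified) to all rows associated with a client for each month, depending on whether every one of those rows meets a condition.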
import pandas as pd

test_data = {'Client Id': [1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2],
'Client Name': ['Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland',
'Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland',
'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',
'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',],
'Week': ['01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022',
'01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
'01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022',
'02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022',
'01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022',
'01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
'01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022',
'02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022'],
'Month': ['January', 'January', 'January', 'January',
"February", "February", "February", "February",
'January', 'January', 'January', 'January',
"February", "February", "February", "February"],
'Year': [2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022,
2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022],
'Payment Status': ["Pending", "Paid in Full", "Didn't Paid", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Pending"]}
test_df = pd.DataFrame(data=test_data)
Data:
Client Id Client Name Week Month Year Payment Status
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full
1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full
1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full
1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full
1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full
2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full
2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full
2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full
2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending
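If every row (week) associated with a client in a given month is Paid in Full, assign Qualified to all of that client's rows (weeks) for that month. If even one week is not paid in full (for example, three weeks are Paid in Full but one is unpaid or Pending), all of those rows are assigned Not Qualified.
Expected output: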
Client Id Client Name Week Month Year Payment Status Qualification
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending Not Qualified
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid Not Qualified
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full Qualified
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending Not Qualified
I don't know how to implement this. I was thinking of using value_counts in a loop:
for name, month in zip(list(test_df["Client Name"].unique()), list(test_df["Month"])):
print(test_df[(test_df["Client Name"] == name) & (test_df["Month"] == month)].value_counts(["Payment Status"]))
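Best answer
The key is to create a boolean mask: True if Payment Status is exactly "Paid in Full", otherwise False. You can then group by Client Id, Month, and Year to check whether all values in each group are True. Use transform to broadcast the result to every row of the group. Finally, replace True/False with their respective labels.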
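The steps above can be followed one at a time on a tiny frame. This is only a sketch: the `mini` DataFrame is a hypothetical four-row cut-down of the question's data, and the variable names `is_paid`, `per_group`, `per_row`, and `labels` are illustrative.

```python
import pandas as pd

# Hypothetical miniature of the question's data: one client, two months.
mini = pd.DataFrame({
    "Client Id": [1, 1, 1, 1],
    "Month": ["January", "January", "February", "February"],
    "Year": [2022, 2022, 2022, 2022],
    "Payment Status": ["Pending", "Paid in Full", "Paid in Full", "Paid in Full"],
})
keys = [mini["Client Id"], mini["Month"], mini["Year"]]

# Step 1: boolean mask, True only where the week was paid in full.
is_paid = mini["Payment Status"] == "Paid in Full"

# Step 2: one True/False per (client, month, year) group.
per_group = is_paid.groupby(keys).all()

# Step 3: transform broadcasts the group result back onto every row.
per_row = is_paid.groupby(keys).transform("all")

# Step 4: turn the booleans into labels.
labels = per_row.map({True: "Qualified", False: "Not Qualified"})
print(labels.tolist())
```

Note that `per_group` has one entry per group (two here), while `per_row` has the same length and index as `mini`, which is why it can be assigned straight back as a column.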
The boolean mask is created on the fly by adding a temporary is_paid column to the DataFrame:
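# Work on a copy of the question's DataFrame (test_df)
df = test_df.copy()

df['Qualification'] = (
    df.assign(is_paid=df['Payment Status'] == 'Paid in Full')  # temporary boolean column
      .groupby(['Client Id', 'Month', 'Year'])['is_paid']
      .transform('all')  # True only if every week in the group is paid in full
      .replace({True: 'Qualified', False: 'Not Qualified'})
)
print(df)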
# Output
Client Id Client Name Week Month Year Payment Status Qualification
0 1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending Not Qualified
1 1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full Not Qualified
2 1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid Not Qualified
3 1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full Not Qualified
4 1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full Qualified
5 1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full Qualified
6 1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full Qualified
7 1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full Qualified
8 2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full Qualified
9 2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full Qualified
10 2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full Qualified
11 2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full Qualified
12 2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full Not Qualified
13 2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full Not Qualified
14 2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full Not Qualified
15 2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending Not Qualified
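If the same rule has to be applied to several frames, it could be wrapped in a small function. The following is a minimal sketch, not part of the original answer: `add_qualification` and its `paid_label` parameter are hypothetical names, it swaps `replace` for `numpy.where`, and it groups by the key columns directly instead of using a temporary column. It assumes the same column names as the question.

```python
import numpy as np
import pandas as pd

def add_qualification(frame, paid_label="Paid in Full"):
    """Return a copy of frame with a Qualification column (hypothetical helper)."""
    keys = [frame["Client Id"], frame["Month"], frame["Year"]]
    # True on every row of a group only if all of its weeks match paid_label.
    all_paid = frame["Payment Status"].eq(paid_label).groupby(keys).transform("all")
    out = frame.copy()
    out["Qualification"] = np.where(all_paid, "Qualified", "Not Qualified")
    return out

# Hypothetical sample: one client, fully paid in January, one pending week in February.
sample = pd.DataFrame({
    "Client Id": [2, 2, 2, 2],
    "Month": ["January", "January", "February", "February"],
    "Year": [2022, 2022, 2022, 2022],
    "Payment Status": ["Paid in Full", "Paid in Full", "Paid in Full", "Pending"],
})
result = add_qualification(sample)
print(result[["Month", "Payment Status", "Qualification"]])
```

The comparison is an exact, case-sensitive string match, so a status spelled "Paid in full" would count as unpaid unless the column is normalized first.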
This question and answer come from Stack Overflow: https://stackoverflow.com/questions/71712570/