I have the following DataFrame:
import pandas as pd

df = pd.DataFrame(
    {
        "customer": ['c1', 'c2', 'c3', 'c4', 'c5'],
        "contract_year": [2018, 2020, 2019, 2018, 2019],
        "amount": [3000, 1000, 3000, 6000, 6000],
        "term": [3, 1, 2, 3, 3]
    }
)
  customer  contract_year  amount  term
0       c1           2018    3000     3
1       c2           2020    1000     1
2       c3           2019    3000     2
3       c4           2018    6000     3
4       c5           2019    6000     3
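My goal: for each customer, divide the amount by the number of "term" years and spread it over the payment years. For example, for customer c1 the payment
df["amount"]/df["term"]
should be assigned to each of the "term" years starting from "contract_year" (2018, 2019 and 2020). These amounts should appear in a new column for each payment year.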
The final result should look like this:
  customer  contract_year  amount  term  2018  2019  2020  2021
0       c1           2018    3000     3  1000  1000  1000
1       c2           2020    1000     1              1000
2       c3           2019    3000     2        1500  1500
3       c4           2018    6000     3  2000  2000  2000
4       c5           2019    6000     3        2000  2000  2000
Many thanks!
Best answer
Let's do it this way:
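# Repeat each row `term` times, one copy per payment year
s = df.reindex(df.index.repeat(df['term']))
# Per-year payment (floordiv is integer division, so any remainder is dropped)
s['val'] = s['amount'].floordiv(s['term'])
# Payment year = contract_year + 0, 1, 2, ... within each original row
s['year'] = s['contract_year'] + s.groupby(level=0).cumcount()
# One column per payment year; the original columns become the row keys
s.pivot_table('val', [*df.columns], 'year', aggfunc='first').reset_index()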
Details:
Use index.repeat to reindex the DataFrame, so that each row is repeated term times:
print(s)
customer contract_year amount term
0 c1 2018 3000 3
0 c1 2018 3000 3
0 c1 2018 3000 3
1 c2 2020 1000 1
2 c3 2019 3000 2
2 c3 2019 3000 2
3 c4 2018 6000 3
3 c4 2018 6000 3
3 c4 2018 6000 3
4 c5 2019 6000 3
4 c5 2019 6000 3
4 c5 2019 6000 3
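The row-repetition step can be tried in isolation. Below is a minimal sketch on a small made-up frame (the `demo` data and variable names are hypothetical, not taken from the question), showing that `Index.repeat` produces the duplicated labels and `reindex` then duplicates the matching rows:

```python
import pandas as pd

# Hypothetical two-row frame, used only to illustrate the repetition step.
demo = pd.DataFrame({"customer": ["a", "b"], "term": [2, 3]})

# Index.repeat repeats each index label `term` times: [0, 0, 1, 1, 1].
repeated_index = demo.index.repeat(demo["term"])

# reindex looks up every label again, so each row appears once per repeat.
expanded = demo.reindex(repeated_index)
print(expanded)
```

This relies on the original index being unique; with duplicate labels in the source frame, `reindex` raises an error.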
Divide amount by term to spread the amount evenly across the term years:
print(s)
customer contract_year amount term val
0 c1 2018 3000 3 1000
0 c1 2018 3000 3 1000
0 c1 2018 3000 3 1000
1 c2 2020 1000 1 1000
2 c3 2019 3000 2 1500
2 c3 2019 3000 2 1500
3 c4 2018 6000 3 2000
3 c4 2018 6000 3 2000
3 c4 2018 6000 3 2000
4 c5 2019 6000 3 2000
4 c5 2019 6000 3 2000
4 c5 2019 6000 3 2000
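One thing worth checking here is that `floordiv` is integer division. The amounts in the question divide evenly, but that may not hold for other data. A small sketch with made-up numbers (not from the question) compares `floordiv` with true division via `div`:

```python
import pandas as pd

# Hypothetical values: the second amount does not divide evenly by its term.
amount = pd.Series([3000, 1000])
term = pd.Series([3, 3])

floor_share = amount.floordiv(term)      # integer division: 1000 and 333
exact_share = amount.div(term)           # true division: 1000.0 and 333.33...
remainder = amount - floor_share * term  # what floordiv leaves unallocated

print(floor_share.tolist(), remainder.tolist())
```

If the yearly payments must add up exactly to the contract amount, `div` (or explicit handling of the remainder) may be the safer choice.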
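Use cumcount to create a sequential counter for each level=0 group (that is, for each original row), then add this counter to contract_year to generate the successive payment years: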
print(s)
customer contract_year amount term val year
0 c1 2018 3000 3 1000 2018
0 c1 2018 3000 3 1000 2019
0 c1 2018 3000 3 1000 2020
1 c2 2020 1000 1 1000 2020
2 c3 2019 3000 2 1500 2019
2 c3 2019 3000 2 1500 2020
3 c4 2018 6000 3 2000 2018
3 c4 2018 6000 3 2000 2019
3 c4 2018 6000 3 2000 2020
4 c5 2019 6000 3 2000 2019
4 c5 2019 6000 3 2000 2020
4 c5 2019 6000 3 2000 2021
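To see what the counter looks like on its own, here is a minimal sketch on a hand-built frame with a repeated index (the `demo` data is hypothetical and stands in for the already-repeated frame `s`):

```python
import pandas as pd

# Hypothetical frame whose index already contains repeated labels,
# as it would after the reindex/repeat step.
demo = pd.DataFrame(
    {"contract_year": [2018, 2018, 2018, 2020]},
    index=[0, 0, 0, 1],
)

# groupby(level=0) groups rows by index label; cumcount numbers the rows
# inside each group starting from 0.
counter = demo.groupby(level=0).cumcount()

# Adding the counter to the start year gives one distinct year per copy.
demo["year"] = demo["contract_year"] + counter
print(demo)
```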
Use pivot_table to reshape the DataFrame, so that each payment year becomes its own column (years without a payment are NaN):
year customer  contract_year  amount  term    2018    2019    2020    2021
0          c1           2018    3000     3  1000.0  1000.0  1000.0     NaN
1          c2           2020    1000     1     NaN     NaN  1000.0     NaN
2          c3           2019    3000     2     NaN  1500.0  1500.0     NaN
3          c4           2018    6000     3  2000.0  2000.0  2000.0     NaN
4          c5           2019    6000     3     NaN  2000.0  2000.0  2000.0
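For convenience, the four steps can be put together into one runnable script. This is only a sketch: the wrapper function `spread_payments` is a hypothetical name introduced here, and clearing the `year` label on the columns is an optional cosmetic step that is not part of the answer.

```python
import pandas as pd


def spread_payments(df):
    """Hypothetical helper: spread each amount over its term years."""
    # 1. Repeat every row `term` times.
    s = df.reindex(df.index.repeat(df["term"]))
    # 2. Per-year payment (integer division, as in the answer).
    s["val"] = s["amount"].floordiv(s["term"])
    # 3. Payment year = contract_year + position within the original row.
    s["year"] = s["contract_year"] + s.groupby(level=0).cumcount()
    # 4. Pivot so that each payment year becomes a column.
    wide = s.pivot_table(
        values="val", index=list(df.columns), columns="year", aggfunc="first"
    ).reset_index()
    # Optional: drop the leftover "year" label on the column axis.
    wide.columns.name = None
    return wide


df = pd.DataFrame(
    {
        "customer": ["c1", "c2", "c3", "c4", "c5"],
        "contract_year": [2018, 2020, 2019, 2018, 2019],
        "amount": [3000, 1000, 3000, 6000, 6000],
        "term": [3, 1, 2, 3, 3],
    }
)

result = spread_payments(df)
print(result)
```

If zeros are preferable to NaN for years without a payment, `pivot_table` also accepts a `fill_value` argument.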
Regarding "python - calculate new columns based on values in existing columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65642320/