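I'm a petroleum engineer and I have a DataFrame showing the monthly oil production of each well. It has the following columns:
Well: the well name.
Date: the first day of each month.
Oil_Prod: the oil production for that month (Date), in cubic meters (m³).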
>>> df = pd.DataFrame({'Well': ['WellA', 'WellA', 'WellA', 'WellA', 'WellB', 'WellB', 'WellB', 'WellC', 'WellC', 'WellC'], 'Date': ['01/01/2020', '01/02/2020', '01/03/2020', '01/04/2020', '01/02/2020', '01/03/2020', '01/04/2020', '01/01/2015', '01/02/2015', '01/05/2015'], 'Oil_Prod': [1000, 2000, 3000, 3000, 2000, 1500, 1500, 500, 500, 300]})
>>> df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
>>> df.sort_values(['Well', 'Date'], inplace=True)
>>> df
Well Date Oil_Prod
0 WellA 2020-01-01 1000
1 WellA 2020-02-01 2000
2 WellA 2020-03-01 3000
3 WellA 2020-04-01 3000
4 WellB 2020-02-01 2000
5 WellB 2020-03-01 1500
6 WellB 2020-04-01 1500
7 WellC 2015-01-01 500
8 WellC 2015-02-01 500
9 WellC 2015-05-01 300
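I need to create a new column containing the cumulative oil production of each well (Oil_Cum). The DataFrame is already sorted by Well and Date.
I tried to solve it with a for loop, but it takes too long to run (see below).
Is there a faster way?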
>>> for well in df['Well'].unique():
...     filter_well = df['Well'] == well
...     df.loc[filter_well, 'Oil_Cum'] = df.loc[filter_well, 'Oil_Prod'].cumsum()
...
>>> df
Well Date Oil_Prod Oil_Cum
0 WellA 2020-01-01 1000 1000.0
1 WellA 2020-02-01 2000 3000.0
2 WellA 2020-03-01 3000 6000.0
3 WellA 2020-04-01 3000 9000.0
4 WellB 2020-02-01 2000 2000.0
5 WellB 2020-03-01 1500 3500.0
6 WellB 2020-04-01 1500 5000.0
7 WellC 2015-01-01 500 500.0
8 WellC 2015-02-01 500 1000.0
9 WellC 2015-05-01 300 1300.0
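Thanks!
Best answer
Try this. Grouping by Well and taking the cumulative sum within each group avoids the Python-level loop: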
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'Well': ['WellA', 'WellA', 'WellA', 'WellA', 'WellB', 'WellB', 'WellB', 'WellC', 'WellC', 'WellC'], 'Date': ['01/01/2020', '01/02/2020', '01/03/2020', '01/04/2020', '01/02/2020', '01/03/2020', '01/04/2020', '01/01/2015', '01/02/2015', '01/05/2015'], 'Oil_Prod': [1000, 2000, 3000, 3000, 2000, 1500, 1500, 500, 500, 300]})
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
df.sort_values(['Well', 'Date'], inplace=True)
# Cumulative sum of Oil_Prod within each well, in a single vectorized call
df['Oil_Cum'] = df.groupby(['Well'])['Oil_Prod'].cumsum()
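As a quick sanity check, the sketch below runs the groupby version and the loop from the question side by side on the same sample data and compares the results. The column name `Oil_Cum_loop` and the variable `same` are made up for this illustration and are not part of the original question or answer. The Date column is left out for brevity, which assumes the rows are already in chronological order within each well.

```python
import pandas as pd

# Sample production data from the question (Date omitted; rows are assumed
# to be already ordered chronologically within each well).
df = pd.DataFrame({
    'Well': ['WellA', 'WellA', 'WellA', 'WellA',
             'WellB', 'WellB', 'WellB',
             'WellC', 'WellC', 'WellC'],
    'Oil_Prod': [1000, 2000, 3000, 3000, 2000, 1500, 1500, 500, 500, 300],
})

# Vectorized approach from the answer: cumulative sum within each well.
df['Oil_Cum'] = df.groupby('Well')['Oil_Prod'].cumsum()

# Loop approach from the question, written to a separate, hypothetical
# column so that both results can be compared.
for well in df['Well'].unique():
    mask = df['Well'] == well
    df.loc[mask, 'Oil_Cum_loop'] = df.loc[mask, 'Oil_Prod'].cumsum()

# True when both approaches give the same value on every row.
same = bool((df['Oil_Cum'] == df['Oil_Cum_loop']).all())
print(same)
```

One visible difference: the loop fills a column that starts out empty, so it ends up holding floats (as in the question's output), while the groupby result keeps the integer type of Oil_Prod.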
Regarding "python - How to use the pandas.DataFrame.cumsum() function with a filter faster?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64088128/