python - Applying multiple if/else statements to a groupby object in pandas

Tags: python pandas group-by

I have a very large DataFrame that looks like this:

id  amt date
1   0   2010-02-01
1   0   2012-05-12
1   0   2016-08-09
1   20  1970-01-01
2   0   2016-03-21
2   0   2017-11-10
2   0   2012-09-01
2   0   2016-04-15

What I want is to reduce it to one row per id according to the following logic:

  1. For a given id group: if amt > 0 and date == 1970-01-01, output that row.
  2. For a given id group: if amt == 0 for all rows of that id, output the row with the max date for that id.

The desired output looks like this:

id  amt date
1   20  1970-01-01
2   0   2017-11-10

I have actually solved it by sorting, grouping by id, and then taking last(). However, I ran into trouble when I tried to write a function that operates on each group of the groupby object and applies the logic in points 1 and 2 above (if/else style). Can someone help me with this?
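For reference, the sort-and-last approach mentioned above could look roughly like the sketch below. It assumes that, within an id that has a positive amt, the 1970-01-01 row carries the largest amt, so sorting by amt and then date puts the wanted row last in each group. The name `reduced` is only illustrative.

```python
import pandas as pd

# Sample data from the question, with dates parsed up front.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 2],
    "amt": [0, 0, 0, 20, 0, 0, 0, 0],
    "date": pd.to_datetime([
        "2010-02-01", "2012-05-12", "2016-08-09", "1970-01-01",
        "2016-03-21", "2017-11-10", "2012-09-01", "2016-04-15",
    ]),
})

# Within each id, sort by amt and then date so that the row with the
# largest amt (ties broken by the latest date) ends up last.
# Assumption: a positive amt only ever appears on the 1970-01-01 row.
reduced = (
    df.sort_values(["id", "amt", "date"])
      .groupby("id", as_index=False)
      .last()
)

print(reduced)
```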

The code for the DataFrame is below. Please note that the data is large, so quick execution is helpful.

Many thanks,

/Swepab

import pandas as pd

df = pd.DataFrame({'id' : [1, 1, 1, 1, 2, 2, 2, 2]
              ,'amt' : [0, 0, 0, 20, 0 ,0, 0, 0]
              ,'date' : ['2010-02-01', '2012-05-12','2016-08-09'
                       ,'1970-01-01','2016-03-21','2017-11-10'
                       ,'2012-09-01','2016-04-15']})

df['date'] = pd.to_datetime(df.date,format = "%Y-%m-%d")

df = df[['id', 'amt', 'date']]

Best answer

I wrote a custom function that you can apply to each group:

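def custom_fx(df):
    # Rule 2: every amt in the group is 0, so return the row(s) with the latest date.
    if (df.amt == 0).all():
        max_date = df.date.max()
        return df.loc[df.date == max_date, :]
    # Rule 1: otherwise return the row(s) with amt > 0 dated 1970-01-01.
    return df[(df.amt > 0) & (df.date == pd.Timestamp("1970-01-01"))]

# Apply the function to each id group and print the result.
for group_id, data in df.groupby("id"):
    print(custom_fx(data))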

Output:

     amt       date       id
 3   20       1970-01-01   1
     amt       date       id
 5    0       2017-11-10   2
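
Since the question mentions that the data is large, a vectorized variant that avoids calling a Python function once per group may run faster. The following is a minimal sketch and not part of the original answer; the names `epoch`, `rule1`, `all_zero`, `zero_rows`, `rule2`, and `result` are illustrative, and it assumes the date column has already been converted with pd.to_datetime.

```python
import pandas as pd

# Sample data from the question, with dates parsed up front.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2, 2],
    "amt": [0, 0, 0, 20, 0, 0, 0, 0],
    "date": pd.to_datetime([
        "2010-02-01", "2012-05-12", "2016-08-09", "1970-01-01",
        "2016-03-21", "2017-11-10", "2012-09-01", "2016-04-15",
    ]),
})

epoch = pd.Timestamp("1970-01-01")

# Rule 1: rows with a positive amt dated 1970-01-01.
rule1 = df[(df["amt"] > 0) & (df["date"] == epoch)]

# Rule 2: flag rows that belong to an id whose amt is 0 on every row ...
all_zero = df["amt"].eq(0).groupby(df["id"]).transform("all")
zero_rows = df[all_zero]
# ... and keep, for each such id, the row holding the latest date.
rule2 = zero_rows.loc[zero_rows.groupby("id")["date"].idxmax()]

# Combine both rule outputs into one row per id.
result = pd.concat([rule1, rule2]).sort_values("id").reset_index(drop=True)

print(result)
```

Unlike the per-group function, this version builds the two rule outputs with boolean masks and a single idxmax per group, then concatenates them. An id that has a positive amt but no 1970-01-01 row would be dropped from `result`, which matches the two rules as stated.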

This question and answer, "python - Applying multiple if/else statements to a groupby object in pandas", come from a similar question found on Stack Overflow: https://stackoverflow.com/questions/47945097/
