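I am trying to create a new dataframe from the existing dataframe below. My goal is to compute the average change in clicks per campaign and categorize each campaign accordingly.
Existing dataframe df: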
campaign | date | clicks
A 2015-10-11 255
A 2015-10-12 367
A 2015-10-13 489
B 2015-10-11 500
B 2015-10-15 122
C 2015-10-11 33
Target dataframe df_categorized:
campaign | avg_change | category
A 0.3858 increasing
B -0.756 decreasing
C 0 no change
I tried this code, but I get the error message TypeError: 'long' object does not support item assignment
#standard packages
import pandas as pd
import numpy as np
#upload data into df
df = pd.read_csv('C:\Users\xxx\Documents\\ad_table.csv')
df.head()
campaign | date | clicks
A 2015-10-11 255
A 2015-10-12 367
A 2015-10-13 489
B 2015-10-11 500
B 2015-10-15 122
C 2015-10-11 33
#create empty dataframe
columns = ['group','avg_change', 'category']
df_categorized = pd.DataFrame(columns=columns)
df_categorized['avg change'] = df.clicks.apply(lambda df: df.pct_change().abs().mean())
#create column
df_categorized['category'] = 0
# going up
df_categorized['category'][df_categorized['avg change'] > 0] = "increasing"
# going down
df_categorized['category'][df_categorized['avg change'] < 0] = "decreasing"
#no change
df_categorized['category'][df_categorized['avg change'] = 0] = "no change"
Best answer
You can groupby on 'campaign', then apply a lambda that computes pct_change and returns the mean. You can then call reset_index on the result and use np.where to add the extra category column:
In [239]:
gp = df.groupby('campaign')['clicks'].apply(lambda x: x.pct_change().mean()).reset_index(name='avg_change').fillna(0)
gp['category'] = np.where(gp['avg_change'] < 0, 'decreasing', np.where(gp['avg_change'] > 0, 'increasing', 'no change'))
gp
Out[239]:
campaign avg_change category
0 A 0.38582 increasing
1 B -0.75600 decreasing
2 C 0.00000 no change
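For readers who want to run this without the CSV file, here is a minimal self-contained sketch of the same approach. The inline data is copied from the question; sorting by date before pct_change is my own assumption (the original answer relies on the rows already being in date order), and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "campaign": ["A", "A", "A", "B", "B", "C"],
    "date": pd.to_datetime([
        "2015-10-11", "2015-10-12", "2015-10-13",
        "2015-10-11", "2015-10-15", "2015-10-11",
    ]),
    "clicks": [255, 367, 489, 500, 122, 33],
})

# Assumption: order rows by date within each campaign so that
# pct_change compares consecutive observations.
df = df.sort_values(["campaign", "date"])

# Mean percentage change per campaign. A campaign with a single row
# yields NaN (nothing to compare against), which is replaced by 0.
gp = (
    df.groupby("campaign")["clicks"]
      .apply(lambda s: s.pct_change().mean())
      .fillna(0)
      .reset_index(name="avg_change")
)

# Nested np.where, as in the answer: negative, positive, otherwise zero.
gp["category"] = np.where(
    gp["avg_change"] < 0, "decreasing",
    np.where(gp["avg_change"] > 0, "increasing", "no change"),
)
print(gp)
```

For campaign A the two changes are (367 - 255) / 255 and (489 - 367) / 367, whose mean is about 0.3858, matching the target table in the question.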
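This:
df_categorized['avg change'] = df.clicks.apply(lambda df: df.pct_change().abs().mean())
won't work. You are calling apply on a column, so the lambda receives each row element in turn, which in this case is an int, not a Series. Hence you get the error:
AttributeError: 'int' object has no attribute 'pct_change'
Even without that error, it would not give you the pct_change per campaign.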
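A small sketch of the difference, using campaign A's clicks from the question as a stand-in Series (the variable names here are hypothetical): apply on a Series hands the function one scalar at a time, whereas pct_change is a method of the Series itself.

```python
import pandas as pd

# Campaign A's clicks from the question, used as a stand-in column.
clicks = pd.Series([255, 367, 489], name="clicks")

# Series.apply calls the function once per element, so the lambda's
# argument is a scalar and has no pct_change method.
error = None
try:
    clicks.apply(lambda v: v.pct_change())
except AttributeError as exc:
    error = str(exc)
print(error)

# Calling pct_change on the Series itself works as intended.
avg_change = clicks.pct_change().mean()
print(round(avg_change, 4))
```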
Also, don't use chained indexing on your df like this:
df_categorized['category'][df_categorized['avg change'] > 0] = "increasing"
It should be:
df_categorized.loc[df_categorized['avg change'] > 0, 'category'] = "increasing"
See the docs.
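As an illustration, the three category assignments from the question could be written with .loc roughly as follows. This is only a sketch: the avg_change values are taken from the answer's output rather than recomputed, and the frame name result is hypothetical. It also sidesteps two other slips in the question's code, which creates a column named 'avg_change' but assigns to 'avg change', and which writes = 0 where a comparison (== 0) is needed.

```python
import pandas as pd

# avg_change values taken from the answer's output above.
result = pd.DataFrame({
    "campaign": ["A", "B", "C"],
    "avg_change": [0.38582, -0.75600, 0.0],
})

# Start from the default label, then overwrite rows that match a mask.
# .loc[row_mask, column] selects and assigns in a single indexing step,
# so the assignment always lands on the original frame.
result["category"] = "no change"
result.loc[result["avg_change"] > 0, "category"] = "increasing"
result.loc[result["avg_change"] < 0, "category"] = "decreasing"
print(result)
```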
Regarding python - View trend changes by group (python pandas dataframe), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41746571/