time_period total_cost total_revenue
7days 150 250
14days 350 600
30days 900 750
7days 180 400
14days 430 620
Given this data, I want to convert the total_cost and total_revenue columns into daily averages for their given time period. I thought this would work:
df[['total_cost','total_revenue']][df.time_period=="7days"] = df[['total_cost','total_revenue']][df.time_period=="7days"]/7
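But it returns the dataframe unchanged.
Best answer
I believe you are operating on a copy of the dataframe: the chained indexing df[[...]][...] assigns into a temporary object, not into df itself. I think you should use apply: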
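from io import StringIO  # Python 3; in Python 2 this was "from StringIO import StringIO"
import pandas
datastring = StringIO("""\
time_period total_cost total_revenue
7days 150 250
14days 350 600
30days 900 750
7days 180 400
14days 430 620
""")
# Columns are separated by whitespace; r'\s+' matches one or more spaces
data = pandas.read_table(datastring, sep=r'\s+')
# Strip the trailing "days" from time_period and divide the cost by the number of days
data['total_cost_avg'] = data.apply(
    lambda row: row['total_cost'] / float(row['time_period'][:-4]),
    axis=1
)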
This gives me:
  time_period  total_cost  total_revenue  total_cost_avg
0       7days         150            250       21.428571
1      14days         350            600       25.000000
2      30days         900            750       30.000000
3       7days         180            400       25.714286
4      14days         430            620       30.714286
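As a possible alternative to apply, here is a minimal sketch that fixes the original attempt with a single .loc assignment and then averages both columns for every period at once. It is not part of the original answer. The names cols, days, by_loc and averaged are only illustrative, and it assumes every time_period value starts with the number of days.

```python
import pandas as pd

df = pd.DataFrame({
    "time_period": ["7days", "14days", "30days", "7days", "14days"],
    "total_cost": [150, 350, 900, 180, 430],
    "total_revenue": [250, 600, 750, 400, 620],
})
cols = ["total_cost", "total_revenue"]

# Cast to float first so the averaged values fit the column dtype.
df[cols] = df[cols].astype(float)

# Option 1: the conditional update from the question, written as one
# .loc call so the assignment lands in the frame itself, not in a copy.
by_loc = df.copy()
mask = by_loc["time_period"] == "7days"
by_loc.loc[mask, cols] = by_loc.loc[mask, cols] / 7

# Option 2: handle every period at once. Parse the leading digits of
# strings such as "7days" (assumption: each value starts with digits).
days = df["time_period"].str.extract(r"^(\d+)", expand=False).astype(float)
averaged = df.copy()
# Divide each row of both columns by that row's period length.
averaged[cols] = df[cols].div(days, axis=0)

print(averaged)
```

Option 1 only touches the "7days" rows, so it would have to be repeated per period. Option 2 avoids both the repetition and the row-by-row lambda.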
Regarding python - conditionally performing calculations on a Pandas dataframe, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/22159805/