python - Pandas: multiplying a column by values from another DataFrame?

Tags: python, pandas

I have two DataFrames, both indexed by a date column named month. The first, df1, has eight rows. The column I care about is df1['num_percent'], which looks like this:

2015-02-01    0.071549
2015-03-01    0.070368
2015-04-01    0.069291
2015-05-01    0.068394
2015-06-01    0.067452
2015-07-01    0.066302
2015-08-01    0.065543
2015-09-01    0.064591
Name: num_percent, dtype: float64

The second DataFrame has 100,000 rows. The column I care about is df2['total_quantity']; a sample of it looks like this:

2014-11-01    324199
2014-12-01    378443
2015-01-01    367379
2015-02-01    336863
2015-03-01    380268
2015-04-01    386292
2015-05-01    373213
2015-06-01    403343
2015-07-01    414310
2015-08-01    403684
2015-09-01    420922
Name: total_quantity, dtype: int64

I want to add a new column to df2 that holds the value of df2['total_quantity'] multiplied by the df1 value for the corresponding month.

How can I do this?

If I try:

df2['percent'] = df2['total_quantity'] * df1['num_percent']

I get ValueError: cannot reindex from a duplicate axis

Update: here is some data and code to reproduce the problem:

import pandas as pd

data = {'month': ['2014-01-01', '2014-02-01', '2014-03-01'],
        'num_percent': [0.4, 0.5, 0.6]}
df1 = pd.DataFrame(data)
df1['month'] = pd.to_datetime(df1['month'])
df1 = df1.set_index('month')

data = {'month': ['2014-01-01', '2014-02-01', '2014-03-01', '2014-01-01'],
        'org': ['00K', '00K', '00K', '00L'],
        'total_quantity': [1000, 1000, 2000, 1000]}
df2 = pd.DataFrame(data)
df2['month'] = pd.to_datetime(df2['month'])
df2 = df2.set_index('month')

# Both of these produce ValueError: cannot reindex... 
df2['percent'] = df1['num_percent'] * df2['total_quantity']
df2.loc[df2.index.isin(df1.index), 'percent'] = df2['total_quantity'] * df1['num_percent']
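The error message points at duplicate index labels. As a quick check on the reproduction data above (a sketch only: `df1` and `df2` mirror the question, while `dup_months` is an illustrative name), one can confirm that the index of `df2` repeats a month while the index of `df1` does not:

```python
import pandas as pd

# Rebuild the sample frames from the question, indexed by month.
df1 = pd.DataFrame(
    {"month": pd.to_datetime(["2014-01-01", "2014-02-01", "2014-03-01"]),
     "num_percent": [0.4, 0.5, 0.6]}
).set_index("month")

df2 = pd.DataFrame(
    {"month": pd.to_datetime(["2014-01-01", "2014-02-01", "2014-03-01", "2014-01-01"]),
     "org": ["00K", "00K", "00K", "00L"],
     "total_quantity": [1000, 1000, 2000, 1000]}
).set_index("month")

# df1 has one row per month; df2 has one row per (month, org) pair,
# so its month index contains repeated labels.
print(df1.index.is_unique)
print(df2.index.is_unique)

# List the month labels that occur more than once in df2's index.
dup_months = df2.index[df2.index.duplicated(keep=False)].unique()
print(dup_months)
```

Here 2014-01-01 appears for both org 00K and org 00L, so the month alone does not identify a single row of `df2`.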

Best answer

If you join the DataFrames first, you can then multiply the columns:

In [24]:
df3 = df1.join(df2)
df3['percent'] = df3['num_percent'] * df3['total_quantity']
df3

Out[24]:
            num_percent  org  total_quantity  percent
month                                                
2014-01-01          0.4  00K            1000      400
2014-01-01          0.4  00L            1000      400
2014-02-01          0.5  00K            1000      500
2014-03-01          0.6  00K            2000     1200
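Since the question asks for the new column alongside the rows of `df2`, the same join idea can also be applied starting from `df2`. The following is a minimal sketch, not part of the original answer, and it assumes the index of `df1` is unique (one `num_percent` per month), as in the sample data; `result` is an illustrative name. `DataFrame.join` performs a left join on the index by default, so every row of `df2` is kept, and any month missing from `df1` would get NaN in `percent`.

```python
import pandas as pd

# Sample frames from the question, indexed by month.
df1 = pd.DataFrame(
    {"month": pd.to_datetime(["2014-01-01", "2014-02-01", "2014-03-01"]),
     "num_percent": [0.4, 0.5, 0.6]}
).set_index("month")

df2 = pd.DataFrame(
    {"month": pd.to_datetime(["2014-01-01", "2014-02-01", "2014-03-01", "2014-01-01"]),
     "org": ["00K", "00K", "00K", "00L"],
     "total_quantity": [1000, 1000, 2000, 1000]}
).set_index("month")

# Left join on the month index: each df2 row picks up the num_percent
# of its month (assumes df1 has at most one row per month).
result = df2.join(df1)

# With both columns in one frame, the product is a plain row-wise multiply.
result["percent"] = result["total_quantity"] * result["num_percent"]
print(result)
```

If the helper column is not wanted afterwards, `result.drop(columns="num_percent")` removes it.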

This is based on a similar question found on Stack Overflow, "python - Pandas: multiplying a column by values from another DataFrame?": https://stackoverflow.com/questions/34853397/
