python - pandas function to create a composite column based on a dict

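I am trying to create a weighted column in a pandas.DataFrame.

I have a Python dictionary whose keys are pandas.DataFrame column names and whose values are the corresponding weights.

I want to create a new column that is the weighted sum of the pandas.DataFrame columns named in the dictionary.

What is an efficient way to do this, given that my dictionary configuration will change and may contain "misconfigured" keys that do not match any column?

For example: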

import pandas as pd
import numpy as np
weights = {'IX1' : 0.3, 'IX2' : 0.2, 'IX3' : 0.4, 'IX4' : 0.1}
np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 3), columns=['IX1', 'IX2', 'IX3'])

# Desired output: combine manually ('IX4' has a weight but no column in df)
df['Composite'] = df['IX1']*0.3 + df['IX2']*0.2 + df['IX3']*0.4

I want the code to still run even if the pandas.DataFrame is missing some of the columns.

Best answer

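First use Index.intersection to build a variable holding the labels that appear both in the DataFrame's columns and in the dictionary's keys. Then select those columns and apply matrix multiplication with dot against a Series created from the dict, filtered to the same columns: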

df['Composite'] = df['IX1']*0.3 + df['IX2']*0.2 + df['IX3']*0.4

cols = df.columns.intersection(weights.keys())
df['Composite1'] = df[cols].dot(pd.Series(weights)[cols])
print(df)
        IX1       IX2       IX3  Composite  Composite1
0  1.764052  0.400157  0.978738   1.000742    1.000742
1  2.240893  1.867558 -0.977278   0.654868    0.654868
2  0.950088 -0.151357 -0.103219   0.213468    0.213468
3  0.410599  0.144044  1.454274   0.733698    0.733698
4  0.761038  0.121675  0.443863   0.430192    0.430192
5  0.333674  1.494079 -0.205158   0.316855    0.316855
6  0.313068 -0.854096 -2.552990  -1.098095   -1.098095
7  0.653619  0.864436 -0.742165   0.072107    0.072107
8  2.269755 -1.454366  0.045759   0.408357    0.408357
9 -0.187184  1.532779  1.469359   0.838144    0.838144
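
Since the dictionary is expected to change, the same steps can be wrapped in a small function. The following is only a minimal sketch: `weighted_composite` is a hypothetical helper name, not part of pandas, and returning the unmatched keys alongside the result is an assumption about what is useful when the configuration is wrong.

```python
import numpy as np
import pandas as pd


def weighted_composite(df, weights):
    """Weighted sum over the columns present in both df and weights.

    Hypothetical helper (not a pandas API). Returns the composite Series
    and the list of weight keys that have no matching column in df.
    """
    w = pd.Series(weights, dtype=float)
    # Labels that exist both as DataFrame columns and as dictionary keys
    cols = df.columns.intersection(w.index)
    # Keys in the dictionary that are ignored because df lacks the column
    missing = w.index.difference(cols).tolist()
    # Matrix product of the selected columns with the matching weights
    return df[cols].dot(w[cols]), missing


weights = {'IX1': 0.3, 'IX2': 0.2, 'IX3': 0.4, 'IX4': 0.1}
np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 3), columns=['IX1', 'IX2', 'IX3'])

composite, missing = weighted_composite(df, weights)
print(missing)  # ['IX4']
```

Reporting the ignored keys instead of dropping them silently makes a misconfigured dictionary easier to spot, while the code still runs when columns are missing.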

For more on python - pandas function to create a composite column based on a dict, see the similar question on Stack Overflow: https://stackoverflow.com/questions/54691337/
