As part of an object's constructor, I want to subtract the values of two pandas DataFrames element-wise:
self.dfload=pd.read_csv(self.name +'/' + 'load.csv')
self.dfload.set_index('snapshot', inplace=True)
load_1 load_2 load_3 load_4
snapshot
2018-01-01 00:00:00 68.248569 91.998188 64.988923 139.535086
2018-01-01 00:15:00 138.274243 127.186259 80.769227 33.509007
2018-01-01 00:30:00 129.824298 56.706114 75.234845 138.610287
2018-01-01 00:45:00 51.754610 45.703056 73.060490 36.913774
2018-01-01 01:00:00 52.129775 139.315283 67.093788 60.488806
and
self.dfsupply=pd.read_csv(self.name + '/' + 'supply.csv')
self.dfsupply.set_index('snapshot', inplace=True)
supply_1 supply_2 supply_3 supply_4
snapshot
2018-01-01 00:00:00 28.448017 45.383377 56.626144 40.044848
2018-01-01 00:15:00 37.534878 29.094980 67.722537 15.002448
2018-01-01 00:30:00 46.163805 28.324557 26.322953 23.250904
2018-01-01 00:45:00 48.192774 55.049855 72.872200 21.602035
2018-01-01 01:00:00 60.499436 53.698329 74.674572 42.425620
using
self.dfresidualLoad=self.dfsupply.subtract(self.dfload, axis='columns')
The result is NaN for every element, and the columns of both DataFrames end up concatenated:
load_1 load_2 load_3 ... supply_1 supply_2 ...
snapshot
2018-01-01 00:00:00 NaN NaN NaN NaN NaN NaN
.
.
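Subtracting a single column works without problems. Unfortunately, that is not an ideal solution, because I want to leave the number of columns undefined.
Best answer
pandas aligns on column labels before subtracting, and load_* and supply_* have no names in common, hence the NaN values. If both DataFrames have the same index and their columns correspond by position, subtract the NumPy array created from the second DataFrame's values: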
self.dfresidualLoad=self.dfsupply - self.dfload.values
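To illustrate the difference, here is a minimal sketch on made-up toy data (the two small frames below are hypothetical stand-ins for load.csv and supply.csv, not the real files). It compares label-based subtraction with subtraction of the underlying array; `to_numpy()` is used here as the equivalent of `.values`.

```python
import pandas as pd

# Hypothetical toy data standing in for load.csv and supply.csv
idx = pd.Index(["2018-01-01 00:00:00", "2018-01-01 00:15:00"], name="snapshot")
dfload = pd.DataFrame({"load_1": [10.0, 20.0], "load_2": [30.0, 40.0]}, index=idx)
dfsupply = pd.DataFrame({"supply_1": [1.0, 2.0], "supply_2": [3.0, 4.0]}, index=idx)

# Label-based subtraction: the frames share no column names, so the result
# holds the union of all columns and every value is NaN
aligned = dfsupply.subtract(dfload)

# Positional subtraction: the NumPy array carries no labels, so values are
# matched purely by row and column position
residual = dfsupply - dfload.to_numpy()

print(aligned)
print(residual)
```

Note that the array also drops the index, so this only gives the intended result when the rows of both frames are in the same order.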
Or, if the columns correspond by position, rename the columns of one DataFrame to match the other:
d = dict(zip(dfload.columns, dfsupply.columns))
df = dfsupply.subtract(dfload.rename(columns=d), axis='columns')
print(df)
supply_1 supply_2 supply_3 supply_4
2018-01-01 00:00:00 -39.800552 -46.614811 -8.362779 -99.490238
2018-01-01 00:15:00 -100.739365 -98.091279 -13.046690 -18.506559
2018-01-01 00:30:00 -83.660493 -28.381557 -48.911892 -115.359383
2018-01-01 00:45:00 -3.561836 9.346799 -0.188290 -15.311739
2018-01-01 01:00:00 8.369661 -85.616954 7.580784 -18.063186
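One reason to prefer the rename approach is that pandas still aligns the rows by index. The following sketch uses hypothetical toy frames (the short labels "t0" and "t1" are made up for illustration) whose rows are deliberately stored in a different order; the subtraction still pairs the matching snapshots.

```python
import pandas as pd

# Hypothetical toy data; "t0" and "t1" stand in for snapshot timestamps
dfload = pd.DataFrame(
    {"load_1": [10.0, 20.0], "load_2": [30.0, 40.0]}, index=["t0", "t1"]
)
# Rows deliberately stored in the opposite order
dfsupply = pd.DataFrame(
    {"supply_1": [2.0, 1.0], "supply_2": [4.0, 3.0]}, index=["t1", "t0"]
)

# Map load column names onto supply column names by position;
# works for any number of columns
d = dict(zip(dfload.columns, dfsupply.columns))

# After renaming, both frames share column labels, so pandas aligns on
# columns and on the index before subtracting
residual = dfsupply.subtract(dfload.rename(columns=d))

print(residual.loc["t0"])
print(residual.loc["t1"])
```

With the array-based approach the same input would silently subtract mismatched rows, so renaming is the safer choice when the row order is not guaranteed.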
Source: "python - Pandas: subtracting the values of two DataFrames element-wise" on Stack Overflow: https://stackoverflow.com/questions/52479184/