I have a Series, normal_row, whose index is:
Int64Index([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
...
910, 911, 912, 913, 914, 915, 916, 917, 918, 919],
dtype='int64', length=919)
I also have a DataFrame, resultp.
resultp.index
returns
Int64Index([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
...
910, 911, 912, 913, 914, 915, 916, 917, 918, 919],
dtype='int64', length=919)
However,
resultp.loc[14].index
returns
Index([u'1', u'2', u'3', u'4', u'5', u'6', u'7', u'8', u'9', u'10',
...
u'910', u'911', u'912', u'913', u'914', u'915', u'916', u'917', u'918',
u'919'],
dtype='object', length=919)
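This causes a problem, because
resultp.mul(normal_row, axis = 1)
returns a DataFrame filled with NaN values. The shape of the DataFrame also changes from (919, 919) to (919, 1838).
This seems to be because the index type changed during the operation. How can this be fixed? And why does pandas keep changing the index type; shouldn't it stay the same as the original index?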
Best answer
resultp.loc[14].index
holds strings. Calling loc[14] returns the row whose index label is 14. That row comes back as a Series whose index is equal to the columns of resultp:
Index([u'1', u'2', u'3', u'4', u'5', u'6', u'7', u'8', u'9', u'10',
...
u'910', u'911', u'912', u'913', u'914', u'915', u'916', u'917', u'918',
u'919'],
dtype='object', length=919)
This shows that the column labels are strings, while the row labels are integers.
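To make that relationship concrete, here is a minimal sketch. The small frame df is a hypothetical stand-in for resultp (integer row labels, string column labels), not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 frame: integer row labels, string column labels.
df = pd.DataFrame(
    np.arange(9).reshape(3, 3),
    index=[1, 2, 3],
    columns=["1", "2", "3"],
)

row = df.loc[2]  # selecting a single row returns a Series

# The Series is indexed by the frame's column labels, not its row labels.
same_labels = row.index.equals(df.columns)
row_labels_are_int = pd.api.types.is_integer_dtype(df.index.dtype)
col_labels_are_str = all(isinstance(label, str) for label in df.columns)

print(same_labels, row_labels_are_int, col_labels_are_str)
```

So pandas is not changing the index type here: .index on a selected row reports the column labels, which were strings all along.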
Consider the following objects:
import numpy as np
import pandas as pd

idx = pd.RangeIndex(0, 5)   # integer labels 0..4
col = idx.astype(str)       # the same labels, but as strings
resultp = pd.DataFrame(np.random.rand(5, 5), idx, col)
normal_row = pd.Series(np.random.rand(5), resultp.index)
Note that col looks the same as idx when printed, but its values are of type str.
print(resultp)
0 1 2 3 4
0 0.242878 0.995860 0.486782 0.601954 0.500455
1 0.015091 0.173417 0.508923 0.152233 0.673011
2 0.022210 0.842158 0.302539 0.408297 0.983856
3 0.978881 0.760028 0.254995 0.610134 0.247800
4 0.233714 0.401079 0.984682 0.354219 0.816966
print(normal_row)
0 0.778379
1 0.019352
2 0.583937
3 0.227633
4 0.646096
dtype: float64
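Because resultp.columns holds strings, none of its labels match the integer labels of normal_row, so this multiplication returns only NaNs: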
resultp.mul(normal_row, axis=1)
0 1 2 3 4 0 1 2 3 4
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
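The doubled column count comes from label alignment: with axis=1, the Series index is matched against the column labels, and the result covers the union of both sets of labels. The sketch below imitates the two label sets with stand-in indexes (str_cols and int_idx are illustrative names, not from the question) and assumes the labels are compared by value and type, so '0' and 0 do not match.

```python
import pandas as pd

# Stand-ins for resultp.columns (strings) and normal_row.index (integers).
str_cols = pd.Index(["0", "1", "2", "3", "4"], dtype=object)
int_idx = pd.Index([0, 1, 2, 3, 4])

# No label appears on both sides, because "0" != 0.
shared = set(str_cols) & set(int_idx)

# The union therefore keeps every label from both sides.
# sort=False avoids trying to order strings against integers.
combined = str_cols.union(int_idx, sort=False)

print(len(shared), len(combined))
```

With nothing shared, every cell in the aligned result is missing on one side, which is why the output has 10 columns here (1838 in the question) and contains only NaN.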
You need to convert resultp.columns to int:
resultp.columns = resultp.columns.astype(int)
Then multiply:
resultp.mul(normal_row, axis=1)
0 1 2 3 4
0 0.305954 0.079327 0.351183 0.588635 0.209578
1 0.136023 0.152232 0.443796 0.493444 0.678651
2 0.411359 0.267142 0.202791 0.327760 0.307422
3 0.399191 0.225889 0.130076 0.147862 0.038032
4 0.039647 0.058929 0.358210 0.684927 0.180250
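As a sanity check, the fix can be compared against plain NumPy broadcasting. This is only a sketch under assumptions: the seeded generator and the names fixed, product and expected are introduced here for illustration, and the data is random rather than the asker's.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the run is repeatable
idx = pd.RangeIndex(0, 5)
resultp = pd.DataFrame(rng.random((5, 5)), index=idx, columns=idx.astype(str))
normal_row = pd.Series(rng.random(5), index=idx)

# Work on a copy so the original string-labelled frame is left untouched.
fixed = resultp.copy()
fixed.columns = fixed.columns.astype(int)

# Labels now match, so each column j is scaled by normal_row[j].
product = fixed.mul(normal_row, axis=1)

# Positional reference result: row vector broadcast across every row.
expected = resultp.to_numpy() * normal_row.to_numpy()

print(product.shape, bool(product.isna().any().any()))
print(np.allclose(product.to_numpy(), expected))
```

If the shape stays (5, 5), no NaN appears, and the values agree with the broadcast product, the labels were aligned as intended.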
Source: the Stack Overflow question "python - Pandas changing index data type": https://stackoverflow.com/questions/41729016/