I have a 33620x160 pandas DataFrame in which one column contains lists of numbers. Each list entry in the DataFrame contains 30 elements.
df['dlrs_col']
0 [0.048142470608688, 0.047021138711858, 0.04573...
1 [0.048142470608688, 0.047021138711858, 0.04573...
2 [0.048142470608688, 0.047021138711858, 0.04573...
3 [0.048142470608688, 0.047021138711858, 0.04573...
4 [0.048142470608688, 0.047021138711858, 0.04573...
5 [0.048142470608688, 0.047021138711858, 0.04573...
6 [0.048142470608688, 0.047021138711858, 0.04573...
7 [0.048142470608688, 0.047021138711858, 0.04573...
8 [0.048142470608688, 0.047021138711858, 0.04573...
9 [0.048142470608688, 0.047021138711858, 0.04573...
10 [0.048142470608688, 0.047021138711858, 0.04573...
I am creating a 33620x30 array whose entries are the unlisted values from that single DataFrame column. I am currently doing it like this:
np.array(df['dlrs_col'].tolist(), dtype = 'float64')
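This works fine, but it takes a significant amount of time, especially since I do a similar calculation for 6 columns of lists. Any ideas on how to speed this up?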
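For anyone who wants to reproduce the setup, here is a minimal sketch. The random values are a synthetic stand-in for the real data (an assumption, since the actual DataFrame is not shown); only the column name `dlrs_col` and the 33620x30 shape come from the question.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumption): 33620 rows,
# each cell of 'dlrs_col' holding a plain Python list of 30 floats.
rng = np.random.default_rng(0)
n_rows, n_elems = 33620, 30
df = pd.DataFrame({"dlrs_col": rng.random((n_rows, n_elems)).tolist()})

# The conversion from the question: list of lists -> 2-D float64 array.
arr = np.array(df["dlrs_col"].tolist(), dtype="float64")
print(arr.shape, arr.dtype)
```

Each cell is a separate Python list, so the conversion has to visit every element individually, which is probably where the time goes.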
Best answer
You can do it this way:
In [140]: df
Out[140]:
dlrs_col
0 [0.048142470608688, 0.047021138711858, 0.04573]
1 [0.048142470608688, 0.047021138711858, 0.04573]
2 [0.048142470608688, 0.047021138711858, 0.04573]
3 [0.048142470608688, 0.047021138711858, 0.04573]
4 [0.048142470608688, 0.047021138711858, 0.04573]
5 [0.048142470608688, 0.047021138711858, 0.04573]
6 [0.048142470608688, 0.047021138711858, 0.04573]
7 [0.048142470608688, 0.047021138711858, 0.04573]
8 [0.048142470608688, 0.047021138711858, 0.04573]
9 [0.048142470608688, 0.047021138711858, 0.04573]
In [141]: df.dlrs_col.apply(pd.Series)
Out[141]:
0 1 2
0 0.048142 0.047021 0.04573
1 0.048142 0.047021 0.04573
2 0.048142 0.047021 0.04573
3 0.048142 0.047021 0.04573
4 0.048142 0.047021 0.04573
5 0.048142 0.047021 0.04573
6 0.048142 0.047021 0.04573
7 0.048142 0.047021 0.04573
8 0.048142 0.047021 0.04573
9 0.048142 0.047021 0.04573
In [142]: df.dlrs_col.apply(pd.Series).values
Out[142]:
array([[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ],
[ 0.04814247, 0.04702114, 0.04573 ]])
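Since the question is about speed, it may be worth timing this approach against the original one on your own data before switching; `apply(pd.Series)` builds one Series per row, so it is not guaranteed to be faster. The sketch below uses a small synthetic DataFrame (an assumption, 1000 rows of 30 values) and hypothetical helper names `via_tolist` and `via_apply_series`. It first checks that both routes give the same array, then prints rough timings.

```python
import timeit

import numpy as np
import pandas as pd

# Small synthetic frame (assumption): 1000 rows of 30-element lists.
rng = np.random.default_rng(0)
df = pd.DataFrame({"dlrs_col": rng.random((1000, 30)).tolist()})


def via_tolist(frame):
    # Approach from the question: list of lists -> 2-D array.
    return np.array(frame["dlrs_col"].tolist(), dtype="float64")


def via_apply_series(frame):
    # Approach from the answer: expand each list into columns, then take the values.
    return frame["dlrs_col"].apply(pd.Series).to_numpy()


a = via_tolist(df)
b = via_apply_series(df)
assert a.shape == b.shape == (1000, 30)
assert np.array_equal(a, b)  # both routes produce identical values

# Rough timings; absolute numbers depend on the machine and pandas version.
for fn in (via_tolist, via_apply_series):
    seconds = timeit.timeit(lambda: fn(df), number=3)
    print(f"{fn.__name__}: {seconds / 3:.4f} s per call")
```

If the timings favor the original `tolist()` route, the remaining cost likely comes from storing lists in an object column in the first place.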
This question about Python (speeding up the creation of a numpy array from lists) comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/39963357/