python - Convert a list of lists to a DataFrame

Tags: python pandas list numpy dataframe

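I have a DataFrame in the format below. It has 907 rows and 2 columns, named Audio and sentence. As you can see, the Audio column contains a list of lists. The total length of that list is 10000.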

Audio                                                     sentence
[[-0.32357552647590637], [-0.4721883237361908],.....],the kind of them is a relative all the little old lady is it to confide in them and head for buying them hate it consists of a vertical schrock
 [[-0.32357552647590637],[-0.4721883237361908],.....]]the kind of them is a relative all the little old lady is it to confide in them and head for buying them hate it consists of a vertical schrock


I tried converting the lists to a DataFrame, but it put every single character into its own column, which is not what I want.

aa = pd.DataFrame.from_records(X_tra)

It produced something like this:

0   1   2   3   4   5   6   7   8   9   ...     269990  269991  269992  269993  269994  269995  269996  269997  269998  269999
0   [   [   0   .   0   0   3   9   1   1   ...     None    None    None    None    None    None    None    None    None    None

The output above is the actual output. The expected output is shown below.

Audio                  Audio1                    sentence
-0.32357552647590637 -0.4721883237361908 ..... the kind of them is a relative all the little old lady is it to confide in them and head for buying them hate it consists of a vertical schrock
-0.32357552647590637 -0.4721883237361908 ......the kind of them is a relative all the little old lady is it to confide in them and head for buying them hate it consists of a vertical schrock

I want to use this output to train a neural network, so the sentence column will be Y and the rest of the DataFrame will be X.
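
A note on the actual output above: one character per column suggests that the Audio cells may be stored as strings rather than as real nested lists (for example after a round trip through CSV). The question does not confirm this, so the following is only a minimal sketch under that assumption. The frame `raw` and the helper `parse_cell` are hypothetical names used for illustration.

```python
import ast

import pandas as pd

# Hypothetical stand-in for the real data: each Audio cell is a *string*
# that looks like a nested list (an assumption, not confirmed by the question).
raw = pd.DataFrame({
    "Audio": [
        "[[-0.32357552647590637], [-0.4721883237361908]]",
        "[[-0.32357552647590637], [-0.4721883237361908]]",
    ],
    "sentence": ["the kind of them is a relative", "More text"],
})


def parse_cell(cell):
    """Return a nested list, parsing the cell only if it is a string."""
    return ast.literal_eval(cell) if isinstance(cell, str) else cell


# After this step every Audio cell holds a real Python list of lists.
raw["Audio"] = raw["Audio"].apply(parse_cell)
```

If the cells are already lists, `parse_cell` leaves them unchanged, so this step is safe to run either way.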

Best answer

How about this solution?

import pandas as pd
import numpy as np

data = pd.DataFrame({'Audio':[[[-0.32357552647590637],[-0.4721883237361908]], [[-0.32357552647590637], [-0.4721883237361908]]],
        'sentence':['the kind of them is a relative all the little old', 'More text']})

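# Flatten each nested list into a 1-D array, then expand it into one column per value
audios = data.Audio.apply(lambda x: np.ravel(np.array(x))).apply(pd.Series)
# Name the new columns Audio0, Audio1, ...
audios.columns = ['Audio' + str(i) for i in range(len(audios.columns))]

# Attach the target column to the flattened features
audios['sentence'] = data['sentence']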

The sample data is:


                  Audio                                    sentence
0   [[-0.32357552647590637], [-0.4721883237361908]] the kind of them is a relative all the little old
1   [[-0.32357552647590637], [-0.4721883237361908]] More text

The result (in the DataFrame audios) is:

    Audio0       Audio1      sentence
0   -0.323576   -0.472188   the kind of them is a relative all the little old
1   -0.323576   -0.472188   More text
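
With 907 rows of 10000 values each, `.apply(pd.Series)` can be slow. A possible alternative, sketched here under the assumption that every row has the same number of values, is to stack the nested lists into one NumPy array and reshape it. The names `flat`, `features`, `X` and `y` are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "Audio": [
        [[-0.32357552647590637], [-0.4721883237361908]],
        [[-0.32357552647590637], [-0.4721883237361908]],
    ],
    "sentence": ["the kind of them is a relative all the little old", "More text"],
})

# Stack all rows into an array of shape (n_rows, n_values, 1), then drop the
# trailing axis by reshaping to (n_rows, n_values).
# Assumes equal-length rows; ragged rows would raise a ValueError here.
flat = np.array(data["Audio"].tolist(), dtype=float).reshape(len(data), -1)

# One column per audio value, named Audio0, Audio1, ...
features = pd.DataFrame(
    flat,
    index=data.index,
    columns=[f"Audio{i}" for i in range(flat.shape[1])],
)

# Split into network inputs and targets, as described in the question.
X = features.to_numpy()
y = data["sentence"].to_numpy()
```

Here `X` is a plain float array with one row per recording, and `y` holds the sentences. The sentences would still need to be encoded numerically before they can serve as training targets.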

Regarding python - Convert a list of lists to a DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57906825/
