I have some data that looks a bit like this:
data = [([('thing1', 'thing1a'),
          ('thing1', 'thing1b'),
          ('thing1', 'thing1c'),
          ('thing1', 'thing1d'),
          ('thing1', 'thing1e')],
         'thing1description'),
        ([('thing2', 'thing2a')],
         'thing2description'),
        ([('thing3', 'thing3a')],
         'thing3description')]
I want to build a dataframe that looks like this:
thing_number  thing_letter  description
thing1        thing1a       thing1description
thing1        thing1b       thing1description
thing1        thing1c       thing1description
thing1        thing1d       thing1description
thing1        thing1e       thing1description
thing2        thing2a       thing2description
thing3        thing3a       thing3description
Thanks to an earlier, very similar question, I can get there with the code below, but I suspect I'm missing something that would make it more elegant:
import pandas as pd

data_ = pd.DataFrame(data, columns=['thing', 'description'])
data_ = data_.explode('thing')
data_ = pd.concat([data_, pd.DataFrame([(*i, k) for k, j in data for i in k], columns=['thing_number', 'thing_letter', 'all'], index=data_.index)], axis=1)
data_ = data_[['thing_number', 'thing_letter', 'description']]
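In short, I'm looking for a more efficient and more elegant way to un-nest a list of tuples. Thanks in advance.
Best answer
Shorter code based on the same approach: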
df = (pd.DataFrame(data, columns=['thing', 'description'])
        .explode('thing', ignore_index=True)  # ignore_index is optional
     )
df[['thing_number', 'thing_letter']] = df.pop('thing').tolist()
Output:
         description thing_number thing_letter
0  thing1description       thing1      thing1a
1  thing1description       thing1      thing1b
2  thing1description       thing1      thing1c
3  thing1description       thing1      thing1d
4  thing1description       thing1      thing1e
5  thing2description       thing2      thing2a
6  thing3description       thing3      thing3a
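If the column order from the question matters, or you would rather skip explode altogether, one possible alternative is to flatten the nested structure in plain Python before building the frame. This is only a sketch of a different technique, not the approach used above; the names rows and df_flat are made up for illustration, and the data is a shortened copy of the sample from the question.

```python
import pandas as pd

# Shortened copy of the sample data from the question.
data = [
    ([('thing1', 'thing1a'), ('thing1', 'thing1b')], 'thing1description'),
    ([('thing2', 'thing2a')], 'thing2description'),
]

# One output row per (number, letter) pair, repeating the description
# of the entry that the pair came from.
rows = [
    (number, letter, description)
    for pairs, description in data
    for number, letter in pairs
]

# Column order is set explicitly here, so no reordering step is needed.
df_flat = pd.DataFrame(rows, columns=['thing_number', 'thing_letter', 'description'])
print(df_flat)
```

Because the rows are built before pandas is involved, the result already has a clean 0..n-1 index and the columns come out in the order listed.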
Regarding "python - Pandas DataFrame from nested tuples", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73956945/