Suppose I have a DataFrame with a column that holds a list of lists:
id pos
0 1 [[['Malaysia','NR'], [':','PU'], ['Natural','JJ'], ['selling price','NN']]]
1 2 [[['Spot Price','NN'], [':','PU'], ['cotton','NN'], ['India', ' NR']]]
Or, in dictionary format:
[{'id': 1,
'pos': "[[['Malaysia','NR'], [':','PU'], ['Natural','JJ'], ['selling price','NN']]]"},
{'id': 2,
'pos': "[[['Spot Price','NN'], [':','PU'], ['cotton','NN'], ['India', ' NR']]]"}]
How can I keep only the inner lists whose second element is NR or NN, and then split (explode) the pos column into rows, as follows:
id words part_of_speech
0 1 Malaysia NR
1 1 selling price NN
2 2 Spot Price NN
3 2 cotton NN
4 2 India NR
How can I do this in Python? Thanks.
Code I tried:
l = [[['Malaysia','NR'], [':','PU'], ['Natural','JJ'], ['selling price','NN']]]
for elem in l[0]:
    print(elem[1])
Out:
NR
PU
JJ
NN
Best answer
You can use explode. Try this:
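import pandas as pd
# Explode twice: once for the outer wrapper list, once for the list of [word, tag] pairs
x = df.explode('pos').explode('pos')
# Turn each [word, tag] pair into two named columns and join them back onto id
x = x[['id']].reset_index(drop=True).join(pd.DataFrame(x['pos'].tolist()).set_axis(['words', 'part_of_speech'], axis=1))
x.loc[x['part_of_speech'].isin(['NN', 'NR'])]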
id words part_of_speech
0 1 Malaysia NR
3 1 selling price NN
4 2 Spot Price NN
6 2 cotton NN
7 2 India NR
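A self-contained sketch of the same approach, starting from the dictionary format in the question, may help when reproducing the result. It rests on two assumptions that the answer does not state: the pos values are strings (as in the dictionary format) and therefore need parsing with ast.literal_eval before explode can work, and the stray leading space in ' NR' in the sample data is unintended, so the tags are stripped before filtering. The names records, pairs and result are illustrative only.

```python
import ast

import pandas as pd

# Sample data copied from the question's dictionary format (pos is a string here).
records = [
    {'id': 1,
     'pos': "[[['Malaysia','NR'], [':','PU'], ['Natural','JJ'], ['selling price','NN']]]"},
    {'id': 2,
     'pos': "[[['Spot Price','NN'], [':','PU'], ['cotton','NN'], ['India', ' NR']]]"},
]
df = pd.DataFrame(records)

# Assumption: pos holds strings, so parse them into real nested lists first.
df['pos'] = df['pos'].apply(ast.literal_eval)

# First explode removes the outer wrapper list, second yields one [word, tag] pair per row.
x = df.explode('pos').explode('pos')

# Build both new columns at once from the list of pairs.
pairs = pd.DataFrame(x['pos'].tolist(), columns=['words', 'part_of_speech'])

# Assumption: whitespace around tags (such as ' NR') is noise, so strip it.
pairs['part_of_speech'] = pairs['part_of_speech'].str.strip()

result = x[['id']].reset_index(drop=True).join(pairs)
result = result.loc[result['part_of_speech'].isin(['NN', 'NR'])]
print(result)
```

If pos already contains real lists rather than strings, the literal_eval step can be dropped.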
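This solution scales easily to DataFrames of any length. It does not assign the new columns one at a time but builds them all at once, so it works no matter how many sublists each row contains.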
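To illustrate that point, here is a minimal sketch that wraps the same steps in a helper and runs it on rows holding different numbers of [word, tag] pairs. The helper name explode_pos and its keep parameter are hypothetical and not part of pandas or of the original answer. The sketch assumes every row has at least one pair, since an empty list would explode to NaN and break the column construction.

```python
import pandas as pd


def explode_pos(df, keep=('NN', 'NR')):
    """Hypothetical helper: explode the nested pos column and keep selected tags."""
    # Two explodes: outer wrapper list first, then the list of [word, tag] pairs.
    x = df.explode('pos').explode('pos')
    # Create both output columns in one step from the list of pairs.
    pairs = pd.DataFrame(x['pos'].tolist(), columns=['words', 'part_of_speech'])
    out = x[['id']].reset_index(drop=True).join(pairs)
    # Keep only the requested part-of-speech tags and renumber the rows.
    return out[out['part_of_speech'].isin(list(keep))].reset_index(drop=True)


# Rows with one pair and with three pairs, to show that the row length does not matter.
df = pd.DataFrame({
    'id': [1, 2],
    'pos': [
        [[['cotton', 'NN']]],
        [[['India', 'NR'], [':', 'PU'], ['Natural', 'JJ']]],
    ],
})

result = explode_pos(df)
print(result)
```

Unlike the answer's output, this version resets the index after filtering, which matches the 0 to n-1 numbering shown in the question's desired result.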
Regarding "python - Filter a list-of-lists column, then split (explode) it into rows in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69171596/