python - Move elements in a list to the desired column

I created a list from the output of an LDA model: lda_model.get_document_topics(bag_of_words)

The model has 7 topics, and a list comprehension gives this result:

[v for v in lda_model_bigram.get_document_topics(bow_corpus_bigram)]

List to DataFrame

import pandas as pd

df = pd.DataFrame([[(0, 0.23410834), (1, 0.010244273), (2, 0.010266962), (3, 0.31661528), (4, 0.010282155), (5, 0.010329775), (6, 0.4081532)],
 [(0, 0.24538451), (3, 0.1353473), (6, 0.58342004)],
 [(0, 0.21097288), (1, 0.2306254), (3, 0.5263941)],
 [(0, 0.020534758), (1, 0.02050926), (2, 0.020555891), (3, 0.020502212), (4, 0.57683885), (5, 0.020568976), (6, 0.3204901)],
 [(2, 0.37945262), (4, 0.12737828), (6, 0.47517183)],
])

It looks like this:

My question is how to align the values by the first element of each tuple, so that it looks like the following:

Best answer

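Use a list comprehension with a nested dict comprehension that keys each entry by the first value of its tuple. The DataFrame constructor then aligns the values by those keys: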

import pandas as pd

L = [[(0, 0.23410834), (1, 0.010244273), (2, 0.010266962), (3, 0.31661528), (4, 0.010282155), (5, 0.010329775), (6, 0.4081532)],
 [(0, 0.24538451), (3, 0.1353473), (6, 0.58342004)],
 [(0, 0.21097288), (1, 0.2306254), (3, 0.5263941)],
 [(0, 0.020534758), (1, 0.02050926), (2, 0.020555891), (3, 0.020502212), (4, 0.57683885), (5, 0.020568976), (6, 0.3204901)],
 [(2, 0.37945262), (4, 0.12737828), (6, 0.47517183)]]

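# Key each tuple by its topic id so the DataFrame constructor aligns it by column
b = [{k: (k, v) for k, v in x} for x in L]

# Topics missing from a document come back as NaN; replace them with 0
df = pd.DataFrame(b).fillna(0)
print(df)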
                  0                 1                 2                 3  \
0   (0, 0.23410834)  (1, 0.010244273)  (2, 0.010266962)   (3, 0.31661528)   
1   (0, 0.24538451)                 0                 0    (3, 0.1353473)   
2   (0, 0.21097288)    (1, 0.2306254)                 0    (3, 0.5263941)   
3  (0, 0.020534758)   (1, 0.02050926)  (2, 0.020555891)  (3, 0.020502212)   
4                 0                 0   (2, 0.37945262)                 0   

                  4                 5                6  
0  (4, 0.010282155)  (5, 0.010329775)   (6, 0.4081532)  
1                 0                 0  (6, 0.58342004)  
2                 0                 0                0  
3   (4, 0.57683885)  (5, 0.020568976)   (6, 0.3204901)  
4   (4, 0.12737828)                 0  (6, 0.47517183)
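
To see why keying by topic id fixes the alignment, here is a minimal sketch on made-up toy data (the `rows` values are hypothetical and not from the question). It contrasts building the frame from the raw lists, which places tuples by position, with building it from dictionaries, where pandas uses the keys as column labels:

```python
import pandas as pd

# Hypothetical toy data: two documents with sparse (topic_id, probability) pairs
rows = [[(0, 0.7), (2, 0.3)],
        [(1, 1.0)]]

# Built from lists: each tuple lands in a column by its position in the list,
# so (1, 1.0) ends up in column 0 even though it belongs to topic 1
positional = pd.DataFrame(rows)

# Built from dicts: the topic id becomes the column label, so values line up;
# topics a document does not mention are NaN until filled
keyed = pd.DataFrame([dict(r) for r in rows])

print(positional)
print(keyed)
```

Note that the keyed frame only has columns for topic ids that actually occur somewhere in the data.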

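Alternatively, build a list of dictionaries that hold only the probabilities, so the final DataFrame is filled with scalars (if that is what you need):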

# Keep only the probability as the value; missing topics become 0
b = [{k: v for k, v in x} for x in L]
df = pd.DataFrame(b).fillna(0)
print(df)
          0         1         2         3         4         5         6
0  0.234108  0.010244  0.010267  0.316615  0.010282  0.010330  0.408153
1  0.245385  0.000000  0.000000  0.135347  0.000000  0.000000  0.583420
2  0.210973  0.230625  0.000000  0.526394  0.000000  0.000000  0.000000
3  0.020535  0.020509  0.020556  0.020502  0.576839  0.020569  0.320490
4  0.000000  0.000000  0.379453  0.000000  0.127378  0.000000  0.475172
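
One caveat: in the data above the first document mentions all seven topics, so the columns come out as 0 to 6 in order. If early documents skip some topics, the column order may follow the order in which keys are first seen (depending on the pandas version), and a topic that never appears gets no column at all. A minimal sketch of one way to guard against that, assuming the number of topics is known up front; `topics_to_frame` and `num_topics` are hypothetical names introduced here for illustration:

```python
import pandas as pd

def topics_to_frame(doc_topics, num_topics):
    """Build a documents x topics table from sparse (topic_id, prob) lists."""
    # dict(pairs) is equivalent to {k: v for k, v in pairs}
    records = [dict(pairs) for pairs in doc_topics]
    df = pd.DataFrame(records)
    # Force one column per topic id, in ascending order; gaps become 0
    return df.reindex(columns=range(num_topics)).fillna(0)

# Two rows taken from the question's data, with the sparser one first
L = [[(2, 0.37945262), (4, 0.12737828), (6, 0.47517183)],
     [(0, 0.24538451), (3, 0.1353473), (6, 0.58342004)]]

df = topics_to_frame(L, num_topics=7)
print(df)
```

Here topics 1 and 5 never occur in either row, yet they still get a zero-filled column because of the `reindex` call.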

Regarding python - Move elements in a list to the desired column, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56626347/
