python - How do I rebuild dummy (one-hot) data row by row?

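I have read the solutions to this problem on Stack Overflow, but none of them covers the case where a row has several columns set and the values must be joined with a separator. For example: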

Input: a DataFrame with one 0/1 indicator column per genre (printed as df1 in the answer below).

Output:

movieId genres
1       Adventure|Animation|Children|Comedy|Fantasy
2       Adventure|Children|Fantasy
3       Comedy|Romance
4       Comedy|Drama|Romance
5       Comedy
6       Action|Crime|Thriller
7       Comedy|Romance

How can I do this with Pandas?

Best answer

Use dot with the column names suffixed with |, then remove the trailing | with rstrip:

print (df1)
   movieId  Action  Adventure  Animation  Children  Comedy  Crime  Drama  \
0        1       0          1          1         1       1      0      0   
1        2       0          1          0         1       0      0      0   
2        3       0          0          0         0       1      0      0   
3        4       0          0          0         0       1      0      1   
4        5       0          0          0         0       1      0      0   
5        6       1          0          0         0       0      1      0   
6        7       0          0          0         0       1      0      0   

   Fantasy  Romance  Thriller  
0        1        0         0  
1        1        0         0  
2        0        1         0  
3        0        1         0  
4        0        0         0  
5        0        0         1  
6        0        1         0  

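# Move movieId into the index so only the 0/1 genre columns remain
df = df1.set_index('movieId')
# dot() appends 'Genre|' for every 1 in a row; rstrip drops the trailing '|'
df2 = df.dot(df.columns + '|').str.rstrip('|').reset_index(name='genres')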

print (df2)
   movieId                                       genres
0        1  Adventure|Animation|Children|Comedy|Fantasy
1        2                   Adventure|Children|Fantasy
2        3                               Comedy|Romance
3        4                         Comedy|Drama|Romance
4        5                                       Comedy
5        6                        Action|Crime|Thriller
6        7                               Comedy|Romance
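
To see why dot works on strings here, the following is a minimal plain-Python sketch of the arithmetic it performs for each row. It is not how pandas implements dot internally. The names labels, rows and rebuild are hypothetical, and only four of the genre columns and two of the movies from the example above are used to keep it short.

```python
# Plain-Python sketch of what df.dot(df.columns + '|') computes per row.
# labels, rows and rebuild are illustrative names, not from the original answer.
labels = ['Action', 'Comedy', 'Drama', 'Romance']

# Two rows of 0/1 indicators, in the same order as labels.
rows = {
    3: [0, 1, 0, 1],  # movieId 3: Comedy, Romance
    4: [0, 1, 1, 1],  # movieId 4: Comedy, Drama, Romance
}

def rebuild(flags, labels, sep='|'):
    # A dot product is a sum of products. With strings, 1 * 'Comedy|' is
    # 'Comedy|', 0 * 'Comedy|' is '', and adding strings concatenates them.
    joined = ''.join(flag * (label + sep) for flag, label in zip(flags, labels))
    # The last kept label leaves a trailing separator, so strip it.
    return joined.rstrip(sep)

genres = {movie_id: rebuild(flags, labels) for movie_id, flags in rows.items()}
print(genres)
```

Because every label carries its own separator, a row with no flags set simply produces an empty string.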

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/53239832/
