python - Split a column >> get unique values >> add the unique values back as columns

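I am learning Python and picked up a dataset from Kaggle to learn more about data exploration and visualization in Python.

I have a "cuisine" column in the data frame in the following format: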

North Indian, Mughlai, Chinese
Chinese, North Indian, Thai
Cafe, Mexican, Italian
South Indian, North Indian
North Indian, Rajasthani
North Indian
North Indian, South Indian, Andhra, Chinese

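I want to split this column on commas and fetch the unique values from it. I want to add those unique values back to the original data frame as new columns.

Based on other posts, I tried the following: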

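1) Convert to a list, then to a set, and flatten to get the unique values

type() shows that the column is a Series. Converting it to a list and then to a set raises an error: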


type(fl1.cuisines)
pandas.core.series.Series

cuisines_type = fl1['cuisines'].tolist()
type(cuisines_type)
list

cuisines_type
#this returns list of cuisines

cuisines_set = set([ a for b in cuisines_type for a in b])
TypeError: 'float' object is not iterable
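
The TypeError most likely comes from missing values: pandas stores a missing entry in a string column as NaN, which is a float, so the inner loop of the comprehension ends up iterating over a float. Iterating over a string also yields single characters, not the comma-separated names. A minimal sketch, assuming a small hypothetical frame `fl1` with one missing value in `cuisines` (the sample rows are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the Kaggle data.
fl1 = pd.DataFrame({
    "cuisines": [
        "North Indian, Mughlai, Chinese",
        np.nan,  # missing value, stored as a float
        "Cafe, Mexican, Italian",
    ]
})

# Drop missing rows first, then split each string on commas
# and strip the space left after each comma.
cuisines_set = {
    name.strip()
    for row in fl1["cuisines"].dropna()
    for name in row.split(",")
}
```

Checking `fl1["cuisines"].isna().sum()` on the real data would confirm whether missing values are the cause.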

2) Convert it to an array and then a list

cs = pd.unique(fl1['cuisines'].str.split(',',expand=True).stack())

type(cs)
Out[141]: numpy.ndarray

cs.tolist()

This returns a list, but I am not able to remove the whitespace that has been added to some of the elements.
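
The stray spaces appear because the split is on "," alone while the data separates items with ", ". One possible way to trim them is sketched below, assuming a pandas version that provides `Series.explode` (0.25 or later) and a made-up sample frame:

```python
import pandas as pd

# Hypothetical sample standing in for the Kaggle data.
fl1 = pd.DataFrame({
    "cuisines": [
        "North Indian, Mughlai, Chinese",
        "Chinese, North Indian, Thai",
        "North Indian",
    ]
})

# Split each row into a list, give every item its own row, then trim spaces.
items = fl1["cuisines"].str.split(",").explode().str.strip()

# unique() keeps values in first-seen order.
cs = items.unique().tolist()
print(cs)  # ['North Indian', 'Mughlai', 'Chinese', 'Thai']
```

If the column contains missing values, adding `.dropna()` before `.unique()` keeps NaN out of the result.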

The expected output is a list of unique cuisines, added back as columns:

North Indian | Mughlai | Chinese

Best answer

I want to split this column on comma and fetch unique values from this column. Those unique values I want to add back to the original data frame as new columns

# dropna() skips missing values (NaN), which str.join cannot handle
a = list(set([i.strip() for i in ','.join(df['cuisine'].dropna()).split(',')]))

Output

['Thai',
 'Mughlai',
 'Mexican',
 'Rajasthani',
 'Andhra',
 'Chinese',
 'North Indian',
 'Cafe',
 'Italian',
 'South Indian']

Use df.assign to add these as columns to the original df (each new column is initialized to 0):

df.assign(**{i:0 for i in a})
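
Note that `assign` returns a new DataFrame, so the result has to be assigned back (for example `df = df.assign(...)`), and every new column holds 0. If the goal is a 0/1 indicator per cuisine for each row, `Series.str.get_dummies` may be closer to what is wanted. A minimal sketch on a made-up sample, assuming the delimiter is always exactly ", ":

```python
import pandas as pd

# Hypothetical sample standing in for the Kaggle data.
df = pd.DataFrame({
    "cuisine": [
        "North Indian, Mughlai, Chinese",
        "Cafe, Mexican, Italian",
        "North Indian",
    ]
})

# One 0/1 column per distinct cuisine; sep must match the delimiter exactly.
dummies = df["cuisine"].str.get_dummies(sep=", ")

# join returns a new frame; the original df is left unchanged.
result = df.join(dummies)
```

Here `result` keeps the original `cuisine` column and adds one column per cuisine, with 1 where the row lists that cuisine.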

Source: the original question on Stack Overflow: https://stackoverflow.com/questions/56899563/
