python - Pandas dataframe: split one column of data into 2 based on a condition

Tags: python pandas dataframe

I have a dataframe like the one below:

             0  
    ____________________________________
0     Country| India  
60        Delhi  
62       Mumbai  
68       Chennai  
75    Country| Italy  
78        Rome  
80       Venice  
85        Milan  
88    Country| Australia  
100      Sydney  
103      Melbourne  
107      Perth  

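I want to split the data into 2 columns, so that one column holds the country and the other holds the city. I don't know where to start. I want something like this: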

             0                    1
    ____________________________________
0     Country| India           Delhi
1     Country| India           Mumbai
2     Country| India           Chennai         
3    Country| Italy           Rome
4    Country| Italy           Venice   
5    Country| Italy           Milan        
6    Country| Australia       Sydney
7   Country| Australia       Melbourne
8   Country| Australia       Perth     

Any idea how to do this?

Best answer

Find the rows that contain | and pull them into a new column, then fill that new column down:

(
    # assumes the column label is the string "0"; use {0: "city"} if the
    # label is the integer 0
    df.rename(columns={"0": "city"})
    # look for rows that contain '|' and put them into a new column
    # called Country; rows that do not match are null in the new column
    # (regex=False matches '|' literally, so no escaping is needed)
    .assign(Country=lambda x: x.loc[x.city.str.contains("|", regex=False), "city"])
    # fill down the Country column, which links each city to its country
    .ffill()
    # drop the header rows, where city and Country hold the same value,
    # so that only cities remain in the city column
    .query("city != Country")
    # reverse the column order to match your expected output
    .iloc[:, ::-1]
)


      Country           city
60  Country| India      Delhi
62  Country| India      Mumbai
68  Country| India      Chennai
78  Country| Italy      Rome
80  Country| Italy      Venice
85  Country| Italy      Milan
100 Country| Australia  Sydney
103 Country| Australia  Melbourne
107 Country| Australia  Perth
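
The output above keeps the original index labels (60, 62, ...), while the question asks for rows numbered 0 to 8. Here is a minimal, self-contained sketch of the same fill-down idea that also renumbers the rows. It is a variant, not the answer's exact code: it uses `Series.where` instead of `assign` and `query`, and the sample frame, including the integer column label `0`, is an assumption rebuilt from the printed data in the question.

```python
import pandas as pd

# Sample data rebuilt from the question. The integer column label 0 and the
# index values are assumptions read off the printed frame.
df = pd.DataFrame(
    {0: ["Country| India", "Delhi", "Mumbai", "Chennai",
         "Country| Italy", "Rome", "Venice", "Milan",
         "Country| Australia", "Sydney", "Melbourne", "Perth"]},
    index=[0, 60, 62, 68, 75, 78, 80, 85, 88, 100, 103, 107],
)

# True for header rows such as "Country| India".
# regex=False makes "|" a literal character rather than regex alternation.
is_country = df[0].str.contains("|", regex=False)

result = (
    pd.DataFrame({
        # keep the text on header rows, NaN elsewhere, then fill down
        "Country": df[0].where(is_country).ffill(),
        "city": df[0],
    })
    .loc[~is_country]          # drop the header rows themselves
    .reset_index(drop=True)    # renumber the rows from 0
)
print(result)
```

If the original index labels matter, leaving out the `reset_index(drop=True)` step keeps them.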

Regarding "python - Pandas dataframe: split one column of data into 2 based on a condition", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64257194/
