I have a pandas DataFrame that looks like this:
date     | location              | occurance
---------------------------------------------
somedate | united_kingdom_london | 5
somedate | united_state_newyork  | 5
I would like it to become:
date     | country        | city    | occurance
------------------------------------------------
somedate | united kingdom | london  | 5
somedate | united state   | newyork | 5
I'm new to Python. After some research I wrote the following code, but it fails to extract the country and the city:
df.location= df.location.replace({'-': ' '}, regex=True)
df.location= df.location.replace({'_': ' '}, regex=True)
temp_location = df['location'].str.split(' ').tolist()
location_data = pd.DataFrame(temp_location, columns=['country', 'city'])
Thanks for your replies.
Best answer
Starting from this:
import pandas as pd
df = pd.DataFrame({'Date': ['somedate', 'somedate'],
                   'location': ['united_kingdom_london', 'united_state_newyork'],
                   'occurence': [5, 5]})
Try this:
df['Country'] = df['location'].str.rpartition('_')[0].str.replace("_", " ")
df['City'] = df['location'].str.rpartition('_')[2]
df[['Date','Country', 'City', 'occurence']]
       Date         Country     City  occurence
0  somedate  united kingdom   london          5
1  somedate    united state  newyork          5
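The attempt in the question fails because splitting on every space turns "united kingdom london" into three pieces, which cannot fit into two columns. As an alternative to `rpartition`, here is a minimal sketch that splits only once, from the right, using `str.rsplit` with `n=1`. It assumes the city is always the last underscore-separated token and never contains an underscore itself; the variable names `parts` and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Date': ['somedate', 'somedate'],
                   'location': ['united_kingdom_london', 'united_state_newyork'],
                   'occurence': [5, 5]})

# Split once, starting from the right: everything before the last '_'
# is the country, everything after it is the city.
parts = df['location'].str.rsplit('_', n=1, expand=True)

# Remaining underscores inside the country name become spaces.
df['Country'] = parts[0].str.replace('_', ' ', regex=False)
df['City'] = parts[1]

result = df[['Date', 'Country', 'City', 'occurence']]
print(result)
```

If a city name could itself contain an underscore, neither this sketch nor `rpartition` can tell where the country ends, and a lookup table of known countries would be needed instead.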
Borrowing @MaxU's idea:
df[['Country', ' ', 'City']] = df.location.str.replace('_', ' ').str.rpartition(' ', expand=True)
df[['Date', 'Country', 'City', 'occurence']]
       Date         Country     City  occurence
0  somedate  united kingdom   london          5
1  somedate    united state  newyork          5
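Note that the one-line variant above also stores the separator in an extra column named `' '`, which then has to be left out of the final selection. A small sketch that avoids creating that column by picking only the first and last pieces of the `rpartition` result; the name `parts` is just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Date': ['somedate', 'somedate'],
                   'location': ['united_kingdom_london', 'united_state_newyork'],
                   'occurence': [5, 5]})

# rpartition returns three columns:
# 0 = text before the last separator, 1 = the separator, 2 = text after it.
parts = df['location'].str.replace('_', ' ', regex=False).str.rpartition(' ')

# Keep only the pieces that are needed; the separator column is never added to df.
df['Country'] = parts[0]
df['City'] = parts[2]

print(df[['Date', 'Country', 'City', 'occurence']])
```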
This question, "python - Pandas DataFrame: split one column into multiple columns", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/38840460/