python - How to correctly use .apply(lambda x: ...) on a DataFrame column

Tags: python pandas dataframe lambda apply

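The line df_modified['lat'] = df.coordinates.apply(lambda x: x[0]) raises TypeError: 'float' object is not subscriptable. Since "coordinates" is already a list (see the sample JSON snippet), I am trying to use a lambda to extract element [0] into a new column named "lat" and element [1] into a new column named "long". Any help with this would be greatly appreciated. Thanks!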

import pandas as pd
import json
import requests

# READS IN JSON
source = requests.get('www.url.com')
data = json.loads(source.text)

# Flattens the JSON data since it had nested dictionaries
# (pd.json_normalize is the top-level name in pandas >= 1.0)
df = pd.json_normalize(data)

# Renamed "lat_long.coordinates" because the "." was confusing .apply() function
df.rename(columns={'lat_long.coordinates': 'coordinates'}, inplace=True)

# Created a new data frame with selected columns
df_modified = df.loc[:, ['county_name', 'arrests', 'incident_count']]

# Attempt to create new columns "lat" and "long" and place the elements accordingly, i.e. [-75.802503, 41.820569]
df_modified['lat'] = df.coordinates.apply(lambda x: x[0])
df_modified['long'] = df.coordinates.apply(lambda x: x[1])

print(df_modified.head(30))

Sample JSON snippet

{
    ":@computed_region_amqz_jbr4": "587",
    ":@computed_region_d3gw_znnf": "18",
    ":@computed_region_nmsq_hqvv": "55",
    ":@computed_region_r6rf_p9et": "36",
    ":@computed_region_rayf_jjgk": "295",
    "arrests": "1",
    "county_code": "44",
    "county_code_text": "44",
    "county_name": "Mifflin",
    "fips_county_code": "087",
    "fips_state_code": "42",
    "incident_count": "1",
    "lat_long": {
      "type": "Point",
      "coordinates": [
        -77.620031,
        40.612749
      ]
    }
}
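
One likely cause of the error, sketched here as an assumption rather than a confirmed diagnosis: if some records in the feed have no "lat_long" entry, the flattened column holds NaN for those rows, and NaN is a Python float, which cannot be indexed. The records below are hypothetical stand-ins shaped like the snippet above; the second one deliberately omits "lat_long". The sketch uses pd.json_normalize, which requires pandas >= 1.0.

```python
import pandas as pd

# Hypothetical records: the second one has no "lat_long" key at all.
records = [
    {
        "county_name": "Mifflin",
        "arrests": "1",
        "lat_long": {"type": "Point", "coordinates": [-77.620031, 40.612749]},
    },
    {"county_name": "Example", "arrests": "2"},
]

df = pd.json_normalize(records)  # requires pandas >= 1.0

# Bracket indexing works even though the column name contains a dot,
# so renaming the column is optional.
coords = df["lat_long.coordinates"]

# Inspect what each cell actually holds: a list, or NaN (a float).
value_types = coords.map(lambda x: type(x).__name__).tolist()
print(value_types)

# Indexing every cell blindly fails on the NaN row.
try:
    coords.apply(lambda x: x[0])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Running the same type check on the real column should show whether any cells are floats instead of lists.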

Best answer

You can also do it the other way around: extract lat and long before filtering the columns.

import json

import pandas as pd

# Load the JSON records from a local file
with open('sample.json') as infile:
    data = json.load(infile)

# Flatten the nested dictionaries (pd.json_normalize requires pandas >= 1.0)
df = pd.json_normalize(data)

# Extract both elements before selecting columns
df.rename(columns={'lat_long.coordinates': 'coordinates'}, inplace=True)
df['lat'] = df['coordinates'].apply(lambda x: x[0])
df['long'] = df['coordinates'].apply(lambda x: x[1])

# Create a new data frame with selected columns
df_modified = df.loc[:, ['county_name', 'arrests', 'incident_count', 'lat', 'long']]
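
If the real data contains rows without coordinates, the lambda still needs a guard, because those cells are NaN rather than lists. The following is a minimal sketch under that assumption, not the answer author's code: the records are hypothetical, and the columns are named on the assumption that the feed follows the GeoJSON convention of [longitude, latitude], which the sample values (-77.62, 40.61) appear to match. That naming differs from the code above, where element [0] is stored as "lat".

```python
import numpy as np
import pandas as pd

# Hypothetical records: the second one has no "lat_long" key.
records = [
    {
        "county_name": "Mifflin",
        "arrests": "1",
        "lat_long": {"type": "Point", "coordinates": [-77.620031, 40.612749]},
    },
    {"county_name": "Example", "arrests": "2"},
]

df = pd.json_normalize(records)  # requires pandas >= 1.0
coords = df["lat_long.coordinates"]

# Guard inside the lambda: only index cells that really are lists.
# Assumed order is [longitude, latitude], as in GeoJSON.
df["long"] = coords.apply(lambda x: x[0] if isinstance(x, list) else np.nan)
df["lat"] = coords.apply(lambda x: x[1] if isinstance(x, list) else np.nan)

# Alternative without a lambda: the .str accessor also indexes list
# elements and leaves NaN cells as NaN.
long_alt = coords.str[0]

df_modified = df.loc[:, ["county_name", "arrests", "long", "lat"]]
print(df_modified)
```

Rows with missing coordinates keep NaN in both new columns. If they should be discarded instead, calling df.dropna(subset=["lat_long.coordinates"]) before extracting is another option.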

Regarding python - How to correctly use .apply(lambda x: ...) on a DataFrame column, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54679162/
