python - Basic Python question

Tags: python json pandas dataframe

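My current approach is quite cumbersome. How can I adapt the code below so that it modifies the link using the country codes in the list and loads those JSON links, instead of concatenating separately built DataFrames?

Thank you very much for your help!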

import urllib.request
import json
import pandas as pd
from datetime import datetime

countries = ["nl", "us", "se"]

## load Dutch top episodes chart
with urllib.request.urlopen("https://podcastcharts.byspotify.com/api/charts/top_episodes?region=nl") as url_NL:
    dataFrameNL = json.load(url_NL)
print(dataFrameNL)

## load US top episodes chart
with urllib.request.urlopen("https://podcastcharts.byspotify.com/api/charts/top_episodes?region=us") as url_US:
    dataFrameUS = json.load(url_US)
print(dataFrameUS)

# creating the dataframe 
## NL
dfNL = pd.json_normalize(dataFrameNL)
## US
dfUS = pd.json_normalize(dataFrameUS)

## add scraped_date
dfNL['scraped_date'] = pd.Timestamp.today().strftime('%Y-%m-%d')
dfUS['scraped_date'] = pd.Timestamp.today().strftime('%Y-%m-%d')

## add rank
dfNL["rank"] = dfNL.index + 1
dfUS["rank"] = dfUS.index + 1

## add country
dfNL['country'] = 'NL'
dfUS['country'] = 'US'

## concatenate
union_dataframes = pd.concat([dfNL, dfUS])

## create file name with date output
file_name = 'mycsvfile' + str(datetime.today().strftime('%Y-%m-%d')) + '.csv'

# write the combined DataFrame to a csv file
union_dataframes.to_csv(file_name, encoding='utf-8', index=False)

I am loading the separate datasets and concatenating them, rather than looping over the list.

Best answer

Create a loop that processes each value in countries and appends the resulting DataFrame to a list; after the loop, join them all together with concat():

import json
import urllib.request
from pathlib import Path

import pandas as pd

countries = ['nl', 'us', 'se']
url_base = 'https://podcastcharts.byspotify.com/api/charts/top_episodes'
today = pd.Timestamp.today().strftime('%Y-%m-%d')

dfs = []
for country in countries:
    # set the country dynamically with an f-string
    with urllib.request.urlopen(f'{url_base}?region={country}') as url:
        dataFrame = json.load(url)
    
    df = pd.json_normalize(dataFrame)
    
    # add scraped_date
    df['scraped_date'] = today
    
    # add rank
    df['rank'] = df.index + 1
    
    # add country, generated dynamically as an uppercase code
    df['country'] = country.upper()
    dfs.append(df)

# concatenate
union_dataframes = pd.concat(dfs)

# create file name with date output
file_path = Path(f'mycsvfile{today}.csv')

# write the combined DataFrame to a csv file
union_dataframes.to_csv(file_path, encoding='utf-8', index=False)
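
The per-country steps in the loop above can also be pulled into a small function, which makes them easy to check without calling the API. The following is only a sketch under stated assumptions: build_chart_frame is a hypothetical helper name, and the sample records (with a made-up episodeName field) stand in for the decoded JSON, whose real structure is not shown in the question.

```python
import pandas as pd


def build_chart_frame(records, country, scraped_date):
    """Turn one country's decoded JSON records into a labelled DataFrame."""
    df = pd.json_normalize(records)
    df['scraped_date'] = scraped_date
    # json_normalize gives each frame a fresh 0-based index, so index + 1 is the rank
    df['rank'] = df.index + 1
    df['country'] = country.upper()
    return df


# made-up sample data standing in for the API responses (assumption)
sample = {
    'nl': [{'episodeName': 'a'}, {'episodeName': 'b'}],
    'us': [{'episodeName': 'c'}],
}

frames = [build_chart_frame(records, country, '2023-03-16')
          for country, records in sample.items()]

# ignore_index=True gives the combined frame one continuous index
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Because rank is computed from each frame's own index before concatenation, it restarts at 1 for every country.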

Edit:

import json
import urllib.request
from pathlib import Path

import pandas as pd

countries = ['nl', 'us', 'se']
# chart names to request; only 'top_episodes' appears in the question,
# so add other chart names here as needed
categories = ['top_episodes']

url_base = 'https://podcastcharts.byspotify.com/api/'
today = pd.Timestamp.today().strftime('%Y-%m-%d')

dfs = []
for country in countries:
    for category in categories:
        # set the country and category dynamically with an f-string
        with urllib.request.urlopen(f'{url_base}charts/{category}?region={country}') as url:
            dataFrame = json.load(url)
        
        df = pd.json_normalize(dataFrame)
        
        # add scraped_date
        df['scraped_date'] = today
        
        # add rank
        df['rank'] = df.index + 1
        
        # add country, generated dynamically as an uppercase code
        df['country'] = country.upper()
        df['category'] = category
        dfs.append(df)

# concatenate
union_dataframes = pd.concat(dfs)

# create file name with date output
file_path = Path(f'mycsvfile{today}.csv')

# write the combined DataFrame to a csv file
union_dataframes.to_csv(file_path, encoding='utf-8', index=False)
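
As a possible variation on the nested loops above, itertools.product can generate every country/category pair in one pass, and urllib.parse.urlencode can build the query string. This is a sketch, not part of the original answer: build_chart_url and API_ROOT are hypothetical names, and 'another_chart' is a placeholder rather than a real chart name.

```python
from itertools import product
from urllib.parse import urlencode

API_ROOT = 'https://podcastcharts.byspotify.com/api/charts'


def build_chart_url(category, country):
    """Return the chart URL for one category/country pair."""
    query = urlencode({'region': country})
    return f'{API_ROOT}/{category}?{query}'


countries = ['nl', 'us', 'se']
# 'top_episodes' comes from the question; 'another_chart' is a placeholder (assumption)
categories = ['top_episodes', 'another_chart']

# product() yields every (country, category) pair, replacing the two nested loops
jobs = [(country, category, build_chart_url(category, country))
        for country, category in product(countries, categories)]

for country, category, url in jobs:
    print(country, category, url)
```

Each tuple in jobs can then be passed to urllib.request.urlopen in a single flat loop, with the country and category already available for the extra columns.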

Regarding "python - Basic Python question", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/75753527/
