python - Basic Python question

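My current approach is rather cumbersome. How can I adapt the code below so that it uses the country codes in the list to modify the link and load those JSON links, instead of concatenating separately built DataFrames?

Many thanks for your help!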

import urllib.request
import json
import pandas as pd
from datetime import datetime

countries = ["nl", "us", "se"]

## load Dutch top episodes chart
with urllib.request.urlopen("https://podcastcharts.byspotify.com/api/charts/top_episodes?region=nl") as url_NL:
    dataFrameNL = json.load(url_NL)
print(dataFrameNL)

## load US top episodes chart
with urllib.request.urlopen("https://podcastcharts.byspotify.com/api/charts/top_episodes?region=us") as url_US:
    dataFrameUS = json.load(url_US)
print(dataFrameUS)

# creating the dataframe 
## NL
dfNL = pd.json_normalize(dataFrameNL)
## US
dfUS = pd.json_normalize(dataFrameUS)

## add scraped_date
dfNL['scraped_date'] = pd.Timestamp.today().strftime('%Y-%m-%d')
dfUS['scraped_date'] = pd.Timestamp.today().strftime('%Y-%m-%d')

## add rank
dfNL["rank"] = dfNL.index + 1
dfUS["rank"] = dfUS.index + 1

## add country
dfNL['country'] = 'NL'
dfUS['country'] = 'US'

## concatenate
union_dataframes = pd.concat([dfNL, dfUS])

## create file name with date output
file_name = 'mycsvfile' + str(datetime.today().strftime('%Y-%m-%d')) + '.csv'

# converted a file to csv
union_dataframes.to_csv(file_name, encoding='utf-8', index=False)

I am loading the datasets one by one and concatenating them, rather than looping over the list.

Best answer

Loop over the values in countries, build a DataFrame for each one and append it to a list, then join the list with concat() after the loop:

import json
import urllib.request
from pathlib import Path

import pandas as pd

countries = ['nl', 'us', 'se']
url_base = 'https://podcastcharts.byspotify.com/api/charts/top_episodes'
today = pd.Timestamp.today().strftime('%Y-%m-%d')

dfs = []
for country in countries:
    # set the country dynamically with an f-string
    with urllib.request.urlopen(f'{url_base}?region={country}') as url:
        dataFrame = json.load(url)
    
    df = pd.json_normalize(dataFrame)
    
    # add scraped_date
    df['scraped_date'] = today
    
    # add rank
    df['rank'] = df.index + 1
    
    # add country, generating the uppercase country code dynamically
    df['country'] = country.upper()
    dfs.append(df)

# concatenate
union_dataframes = pd.concat(dfs)

# create file name with date output
file_path = Path(f'mycsvfile{today}.csv')

# write the combined DataFrame to a CSV file
union_dataframes.to_csv(file_path, encoding='utf-8', index=False)
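
If it helps, the per-country steps can also be pulled into a small function so they can be tried without hitting the network. This is only a sketch: `build_chart_frame`, the sample payloads and the `episodeName` field are made up for illustration, and it assumes (as the code above does) that the API returns a list of records already in rank order.

```python
import pandas as pd


def build_chart_frame(payload, country, scraped_date):
    """Turn one region's decoded JSON payload into a DataFrame (hypothetical helper)."""
    df = pd.json_normalize(payload)
    df["scraped_date"] = scraped_date
    # Rank follows the row order of the payload, starting at 1.
    df["rank"] = range(1, len(df) + 1)
    df["country"] = country.upper()
    return df


# Made-up payloads standing in for json.load(url); the field name is an assumption.
sample_payloads = {
    "nl": [{"episodeName": "A"}, {"episodeName": "B"}],
    "us": [{"episodeName": "C"}],
}

frames = [
    build_chart_frame(payload, country, "2023-03-16")
    for country, payload in sample_payloads.items()
]

# ignore_index=True gives the combined frame a fresh 0..n-1 index
# instead of repeating each region's own index.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

In the real loop you would call the function with the result of json.load(url) for each country. Because the rank is computed per frame before concatenating, it restarts at 1 for every country.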

Edit:

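import json
import urllib.request
from pathlib import Path

import pandas as pd

countries = ['nl', 'us', 'se']
# only 'top_episodes' appears in the question; add other chart endpoints here
categories = ['top_episodes']

url_base = 'https://podcastcharts.byspotify.com/api/charts'
today = pd.Timestamp.today().strftime('%Y-%m-%d')

dfs = []
for country in countries:
    for category in categories:
        # set the category and country dynamically with an f-string
        with urllib.request.urlopen(f'{url_base}/{category}?region={country}') as url: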
            dataFrame = json.load(url)
        
        df = pd.json_normalize(dataFrame)
        
        # add scraped_date
        df['scraped_date'] = today
        
        # add rank
        df['rank'] = df.index + 1
        
        # add country, generating the uppercase country code dynamically
        df['country'] = country.upper()
        df['category'] = category
        dfs.append(df)

# concatenate
union_dataframes = pd.concat(dfs)

# create file name with date output
file_path = Path(f'mycsvfile{today}.csv')

# write the combined DataFrame to a CSV file
union_dataframes.to_csv(file_path, encoding='utf-8', index=False)
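
As a possible variation on the nested loops, the (country, category) pairs can be generated with itertools.product and the query string built with urllib.parse.urlencode. This is a rough sketch, not part of the original answer: `chart_url` is a hypothetical helper, and the base URL and the single `top_episodes` category are taken from the question's own links.

```python
from itertools import product
from urllib.parse import urlencode

# Assumption: chart endpoints live under this base, as in the question's URLs.
URL_BASE = "https://podcastcharts.byspotify.com/api/charts"


def chart_url(category, country):
    """Build the request URL for one chart category and region (hypothetical helper)."""
    query = urlencode({"region": country})
    return f"{URL_BASE}/{category}?{query}"


countries = ["nl", "us", "se"]
# Only "top_episodes" appears in the question; add further endpoint names here.
categories = ["top_episodes"]

# product() yields every (country, category) pair, replacing the two nested loops.
urls = [
    chart_url(category, country)
    for country, category in product(countries, categories)
]

for url in urls:
    print(url)
```

Each URL can then be passed to urllib.request.urlopen exactly as in the loop above.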

Regarding python - Basic Python question, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/75753527/
