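I'm trying to build a data table containing only country, year, and value from the World Bank API, but I can't seem to filter out just the data I want. I've seen that this kind of question has been asked before, but none of the answers seem to work for me.
I'd really appreciate some help. Thanks!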
import requests
import pandas as pd
from bs4 import BeautifulSoup
import json

url = "http://api.worldbank.org/v2/country/{}/indicator/NY.GDP.PCAP.CD?date=2015&format=json"
country = ["DZA","AGO","ARG","AUS","AUT","BEL","BRA","CAN","CHL","CHN","COL","CYP", "CZE","DNK","FIN","FRA","GEO","DEU",
           "GRC""HUN","ISL","IND","IDN","IRL","ISR","ITA","JPN","KAZ","KWT","LBN","LIE","MYS","MEX","MCO","MAR","NPL","NLD",
           "NZL","NGA","NOR","OMN","PER","PHL","POL","PRT","QAT","ROU","SGP","ZAF","ESP","SWE","CHE","TZA","THA","TUR","UKR",
           "GBR","USA","VNM","ZWE"]

html = {}
for i in country:
    url_one = url.format(i)
    html[i] = requests.get(url_one).json()

my_values = []
for i in country:
    value = html[i][1][0]['value']
    my_values.append(value)
Edit
My data currently looks like the following, and I'm trying to extract the country name from 'country': {'id': 'AO', 'value': 'Angola'}, along with 'date' and 'value'.
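Best answer
Note: It would probably be better to store the information for all years at once rather than for a single year, so you can simply filter later in your processing. Also, take a look at your country list: a comma is missing between "GRC" and "HUN".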
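To see why the missing comma matters: Python joins adjacent string literals into a single string, so the list silently ends up with one bogus code instead of two valid ones. A minimal sketch, using a short illustrative excerpt of the list rather than the full one:

```python
# Adjacent string literals are concatenated by Python,
# so a missing comma does not raise an error.
broken = ["DEU", "GRC""HUN", "ISL"]    # comma missing between "GRC" and "HUN"
fixed = ["DEU", "GRC", "HUN", "ISL"]

print(broken)                   # ['DEU', 'GRCHUN', 'ISL']
print(len(broken), len(fixed))  # 3 4
```

A request for "GRCHUN" will not return the usual record list, which is one reason the example further down wraps the loop body in try/except.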
There are different options to achieve your goal; here are just two to point you in the right direction.
Option #1
Pick the information you need from the JSON response, build a reshaped dict, and append() it to my_values:
for d in data[1]:
    my_values.append({
        'country': d['country']['value'],
        'date': d['date'],
        'value': d['value']
    })
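If you want to try the reshaping without hitting the network, here is a rough sketch on a hand-written payload. The payload is hypothetical and trimmed down: it only mimics the shape of the response (paging metadata first, then the list of records), and the numbers are placeholders, not real GDP figures.

```python
# Hypothetical, trimmed-down payload in the shape the API returns:
# element 0 is paging metadata, element 1 is the list of records.
# The numeric values are placeholders, not real data.
data = [
    {"page": 1, "pages": 1, "per_page": 50, "total": 2},
    [
        {"country": {"id": "AO", "value": "Angola"}, "date": "2015", "value": 1.0},
        {"country": {"id": "AO", "value": "Angola"}, "date": "2014", "value": 2.0},
    ],
]

my_values = []
for d in data[1]:
    # Flatten the nested 'country' dict down to just its name.
    my_values.append({
        "country": d["country"]["value"],
        "date": d["date"],
        "value": d["value"],
    })

print(my_values[0])  # {'country': 'Angola', 'date': '2015', 'value': 1.0}
```

As far as I know the API pages its results, so when you request all years it is worth checking the "pages" entry in data[0] to see whether there is more than one page to fetch.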
Example
import requests
import pandas as pd

url = 'http://api.worldbank.org/v2/country/%s/indicator/NY.GDP.PCAP.CD?format=json'
countries = ["DZA","AGO","ARG","AUS","AUT","BEL","BRA","CAN","CHL","CHN","COL","CYP", "CZE","DNK","FIN","FRA","GEO","DEU",
             "GRC","HUN","ISL","IND","IDN","IRL","ISR","ITA","JPN","KAZ","KWT","LBN","LIE","MYS","MEX","MCO","MAR","NPL","NLD",
             "NZL","NGA","NOR","OMN","PER","PHL","POL","PRT","QAT","ROU","SGP","ZAF","ESP","SWE","CHE","TZA","THA","TUR","UKR",
             "GBR","USA","VNM","ZWE"]

my_values = []
for country in countries:
    data = requests.get(url % country).json()

    try:
        # data[1] holds the list of yearly records for this country
        for d in data[1]:
            my_values.append({
                'country': d['country']['value'],
                'date': d['date'],
                'value': d['value']
            })
    except Exception as err:
        print(f'[ERROR] country ==> {country} error ==> {err}')

pd.DataFrame(my_values).sort_values(['country', 'date'], ascending=True)
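Since this stores every year that comes back, picking out a single year afterwards is a one-line filter. A small sketch, assuming the resulting frame is bound to a name like df (my naming, not part of the code above) and that 'date' is still a string as in the JSON; the rows are placeholders, not real figures:

```python
import pandas as pd

# Placeholder rows in the same shape as my_values; values are not real data.
my_values = [
    {"country": "Angola", "date": "2015", "value": 1.0},
    {"country": "Angola", "date": "2014", "value": 2.0},
    {"country": "Algeria", "date": "2015", "value": 3.0},
]

df = pd.DataFrame(my_values).sort_values(["country", "date"], ascending=True)

# 'date' is a string here, so compare against "2015" rather than 2015.
df_2015 = df[df["date"] == "2015"].reset_index(drop=True)
print(df_2015)
```

If you would rather work with numeric years, converting the column first with df["date"].astype(int) should also work.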
Option #2
Create DataFrames directly from the JSON response, concatenate them, and make some adjustments to the final DataFrame:
for d in data[1]:
    my_values.append(pd.DataFrame(d))
...
# keep the rows indexed 'value' (they carry the country name), then the three columns
pd.concat(my_values).loc[['value']][['country','date','value']].sort_values(['country', 'date'], ascending=True)
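A related variant, not what the snippet above does but possibly easier to read, is to flatten the records with pd.json_normalize and then select and rename columns. A rough sketch on hypothetical records with placeholder values (records stands in for data[1]):

```python
import pandas as pd

# Hypothetical records in the shape of data[1]; values are placeholders.
records = [
    {"country": {"id": "AO", "value": "Angola"}, "date": "2015", "value": 1.0},
    {"country": {"id": "DZ", "value": "Algeria"}, "date": "2015", "value": 2.0},
]

# Nested dicts become dotted column names such as 'country.value'.
flat = pd.json_normalize(records)

df = (
    flat[["country.value", "date", "value"]]
    .rename(columns={"country.value": "country"})
    .sort_values(["country", "date"], ascending=True)
    .reset_index(drop=True)
)
print(df)
```

This avoids the intermediate 'id' / 'value' index rows, at the cost of having to rename the flattened column.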
Regarding python - getting data from the World Bank API with pandas, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/70575998/