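I am trying to loop over several JSON responses and add each value from the list to a DataFrame. For each JSON response I create a column header. I only ever seem to get data in the last column, so I assume something is clearly wrong with the way I append the data.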
    from pycoingecko import CoinGeckoAPI
    cg = CoinGeckoAPI()
    df = pd.DataFrame()
    timePeriod = 120
    for x in range(10):
        try:
            data = cg.get_coin_market_chart_by_id(id=geckoList[x],
                                                  vs_currency ='btc', days = 'timePeriod')
            for y in range(timePeriod):
                df = df.append({geckoList[x]: data['prices'][y][1]},
                               ignore_index= True)
            print(geckoList[x])
        except:
            pass
Example geckoList:
    ['bitcoin',
     'ethereum',
     'xrp',
     'bitcoin-cash',
     'litecoin',
     'binance-coin']
Example JSON for one coin:
    'prices': [[1565176840078, 0.029035263522626625],
               [1565177102060, 0.029079747150763842],
               [1565177434439, 0.029128983083947863],
               [1565177700686, 0.029136960678700433],
               [1565178005716, 0.0290826667213779],
               [1565178303855, 0.029173025688296675],
               [1565178602640, 0.029204331218623796],
               [1565178911561, 0.029211943928343167],
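The expected result is a DataFrame with one column per cryptocurrency and a row for each data point. Right now only the last column shows data.
Currently it looks like this:
       bitcoin  ethereum  bitcoin-cash
    0      NaN       NaN          0.33
    1      NaN       NaN          0.32
    2      NaN       NaN          0.21
    3      NaN       NaN          0.22
    4      NaN       NaN          0.25
    5      NaN       NaN          0.26
    6      NaN       NaN          0.22
    7      NaN       NaN          0.22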
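Best answer
OK, I think I found the problem.
The problem is that you append a data structure containing only one column to the frame, row by row, so every other column in that row is filled with NaN. I think what you want is to join the columns by timestamp. That is what I did in the example below. Let me know if this is what you need: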
    from pycoingecko import CoinGeckoAPI
    import pandas as pd

    cg = CoinGeckoAPI()
    timePeriod = 120
    gecko_list = ['bitcoin',
                  'ethereum',
                  'xrp',
                  'bitcoin-cash',
                  'litecoin',
                  'binance-coin']

    # Collect timestamps and prices per coin; coins that fail are reported and skipped.
    data = {}
    for coin in gecko_list:
        try:
            # Pass the variable timePeriod, not the string 'timePeriod'.
            nested_lists = cg.get_coin_market_chart_by_id(
                id=coin, vs_currency='btc', days=timePeriod)['prices']
            data[coin] = {}
            # Split [[timestamp, price], ...] into two parallel tuples.
            data[coin]['timestamps'], data[coin]['values'] = zip(*nested_lists)
        except Exception as e:
            print(e)
            print('coin: ' + coin)

    # One single-column frame per coin, indexed by timestamp.
    frame_list = [pd.DataFrame(
                      data[coin]['values'],
                      index=data[coin]['timestamps'],
                      columns=[coin])
                  for coin in gecko_list
                  if coin in data]

    # Join the frames column-wise on the timestamp index.
    df = pd.concat(frame_list, axis=1).sort_index()
    df.index = pd.to_datetime(df.index, unit='ms')
    print(df)
This gives me the output:
                             bitcoin  ethereum  bitcoin-cash  litecoin
    2019-08-07 12:20:14.490      NaN       NaN      0.029068       NaN
    2019-08-07 12:20:17.420      NaN       NaN           NaN  0.007890
    2019-08-07 12:20:21.532      1.0       NaN           NaN       NaN
    2019-08-07 12:20:27.730      NaN  0.019424           NaN       NaN
    2019-08-07 12:24:45.309      NaN       NaN      0.029021       NaN
    ...                          ...       ...           ...       ...
    2019-08-08 12:15:47.548      NaN       NaN           NaN  0.007578
    2019-08-08 12:18:41.000      NaN  0.018965           NaN       NaN
    2019-08-08 12:18:44.000      1.0       NaN           NaN       NaN
    2019-08-08 12:18:54.000      NaN       NaN           NaN  0.007577
    2019-08-08 12:18:59.000      NaN       NaN      0.028144       NaN
    [1153 rows x 4 columns]
This is the data I get when I switch days to 180.
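To make the cause of the NaN values concrete, here is a minimal offline sketch. The `sample` dictionary holds made-up `[timestamp_ms, price]` pairs, not real CoinGecko output, and the two coins are assumed to share timestamps for simplicity. It contrasts the row-by-row construction from the question with a column-wise construction indexed by timestamp. Because `DataFrame.append` was removed in pandas 2.0, the row-by-row case is reproduced here by building the frame from a list of one-key dicts, which yields the same staggered shape.

```python
import pandas as pd

# Made-up sample data (an assumption for illustration, not real API output).
sample = {
    "bitcoin": [[1565176800000, 1.0], [1565177100000, 1.0]],
    "ethereum": [[1565176800000, 0.0290], [1565177100000, 0.0291]],
}

# Row-by-row: each row carries exactly one column, so every other column
# in that row is NaN. This mirrors repeated df.append(..., ignore_index=True).
rows = [{coin: price} for coin, pairs in sample.items() for _, price in pairs]
staggered = pd.DataFrame(rows)

# Column-wise: one Series per coin, indexed by timestamp. Building a frame
# from a dict of Series aligns the columns on the shared index.
columns = {
    coin: pd.Series([price for _, price in pairs],
                    index=[ts for ts, _ in pairs])
    for coin, pairs in sample.items()
}
aligned = pd.DataFrame(columns)
aligned.index = pd.to_datetime(aligned.index, unit="ms")

print(staggered)
print(aligned)
```

In the real data each coin reports its own timestamps, which is why the joined frame above still contains NaN in most cells: the rows only line up once they are aggregated to a common frequency.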
To get daily data, use the groupby function:
    df = df.groupby(pd.Grouper(freq='D')).mean()
On a 5-day DataFrame, this gives me:
                bitcoin  ethereum  bitcoin-cash  litecoin
    2019-08-03      1.0  0.020525      0.031274  0.008765
    2019-08-04      1.0  0.020395      0.031029  0.008583
    2019-08-05      1.0  0.019792      0.029805  0.008360
    2019-08-06      1.0  0.019511      0.029196  0.008082
    2019-08-07      1.0  0.019319      0.028837  0.007854
    2019-08-08      1.0  0.018949      0.028227  0.007593
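The daily grouping fills the gaps because `mean()` skips NaN by default, so each coin's scattered observations within a day collapse into a single value. The sketch below shows this on a small made-up frame (the timestamps and prices are invented for illustration). It also compares the result with `df.resample('D').mean()`, which should be an equivalent way to write the same aggregation on a DatetimeIndex.

```python
import numpy as np
import pandas as pd

# Made-up prices with coin-specific timestamps, mimicking the staggered frame.
index = pd.to_datetime([
    "2019-08-07 12:20:14", "2019-08-07 12:20:27", "2019-08-07 18:00:00",
    "2019-08-08 09:00:00", "2019-08-08 09:00:05",
])
df = pd.DataFrame(
    {
        "bitcoin": [1.0, np.nan, 1.0, 1.0, np.nan],
        "ethereum": [np.nan, 0.02, np.nan, np.nan, 0.03],
    },
    index=index,
)

# Group rows into calendar days; NaN cells are ignored when averaging.
daily = df.groupby(pd.Grouper(freq="D")).mean()

# Alternative spelling of the same daily aggregation.
daily_resampled = df.resample("D").mean()

print(daily)
```

A day in which a coin has no observations at all would still come out as NaN, so this only removes the gaps caused by mismatched timestamps within a day.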
Source: a similar question about python, "My DataFrame has NaN in every column except the last one", found on Stack Overflow: https://stackoverflow.com/questions/57410576/