I have 3 DataFrames: GDP, energy, and ScimEn.
print(GDP.index.name)     # Country
print(energy.index.name)  # None
print(ScimEn.index.name)  # None
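energy and ScimEn have no index name, but both do have a 'Country' column. I want to merge all three DataFrames on 'Country'. How can I do this? I have tried the following: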
newdf = (pd.merge(energy, ScimEn, how='inner', on='Country').merge(GDP, how='inner', on=GDP.index.name))
> KeyError: 'Country'
If I try:
newdf = (pd.merge(energy, ScimEn, how='inner', on='Country').
merge(GDP, how='inner', left_index=True))
raise MergeError('Must pass right_on or right_index=True')
pandas.tools.merge.MergeError: Must pass right_on or right_index=True
If I try:
newdf = (pd.merge(energy, ScimEn, how='inner', on='Country').
merge(GDP, how='inner', left_index=True, right_index=True))
It returns:
Empty DataFrame
Columns: [Country, Energy Supply, Energy Supply per Capita, % Renewable, Rank, Documents, Citable documents, Citations, Self-citations, Citations per document, H index, Country Code, Indicator Name, Indicator Code, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015]
Index: []
Best answer
You can call reset_index on GDP so that its 'Country' index becomes a regular column:
newdf = pd.merge(energy.reset_index(), ScimEn, on='Country').merge(GDP.reset_index(), on='Country')
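To see why the index-on-index attempt in the question comes back empty while reset_index works, here is a minimal sketch. The three frames are tiny made-up stand-ins (the values are not from the real data), and the names left, by_index and by_column are only for illustration.

```python
import pandas as pd

# Toy stand-ins for the real frames; all values are invented for illustration.
energy = pd.DataFrame({"Country": ["China", "Japan", "Brazil"],
                       "Energy Supply": [1.0, 2.0, 3.0]})
ScimEn = pd.DataFrame({"Country": ["China", "Japan", "India"],
                       "Rank": [1, 3, 8]})
# GDP is indexed by 'Country', as in the question.
GDP = pd.DataFrame({"2015": [10.0, 20.0, 30.0]},
                   index=pd.Index(["China", "Japan", "Brazil"], name="Country"))

left = pd.merge(energy, ScimEn, how="inner", on="Country")

# Index-on-index merge: `left` carries the default integer index, while GDP
# is indexed by country name, so no index labels match.
by_index = left.merge(GDP, how="inner", left_index=True, right_index=True)
print(by_index.empty)

# After reset_index, 'Country' is an ordinary column on both sides.
by_column = left.merge(GDP.reset_index(), how="inner", on="Country")
print(by_column)
```

Recent pandas versions also accept an index level name in `on`, so the first attempt in the question may run unchanged there; reset_index does not depend on that behaviour.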
If there are many DataFrames, you can use reduce:
from functools import reduce
dfs = [energy.reset_index(), ScimEn, GDP.reset_index()]
newdf = reduce(lambda left,right: pd.merge(left,right,on='Country'), dfs)
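The reduce pattern can be wrapped so that each frame is only reset when the key really lives in its index. The sketch below is one possible way to do that, not part of the original answer: as_column and merge_all are hypothetical helper names, and the data is made up.

```python
from functools import reduce

import pandas as pd


def as_column(df, key="Country"):
    """Hypothetical helper: make sure `key` is a regular column."""
    # Only reset when the key is an index level; this avoids adding a
    # stray 'index' column to frames that already have the key as a column.
    return df.reset_index() if key in df.index.names else df


def merge_all(frames, key="Country"):
    """Hypothetical helper: inner-merge any number of frames on `key`."""
    frames = [as_column(f, key) for f in frames]
    return reduce(lambda left, right: pd.merge(left, right, on=key, how="inner"),
                  frames)


# Toy stand-ins for the real frames; all values are invented for illustration.
energy = pd.DataFrame({"Country": ["China", "Japan", "Brazil"],
                       "Energy Supply": [1.0, 2.0, 3.0]})
ScimEn = pd.DataFrame({"Country": ["China", "Japan", "India"],
                       "Rank": [1, 3, 8]})
GDP = pd.DataFrame({"2015": [10.0, 20.0, 30.0]},
                   index=pd.Index(["China", "Japan", "Brazil"], name="Country"))

merged = merge_all([energy, ScimEn, GDP])
print(merged)
```

Checking df.index.names rather than calling reset_index on everything keeps the merged columns limited to the original ones.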
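A solution using join, which does not need reset_index on GDP because join matches the left frame's 'Country' column against GDP's index: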
newdf = pd.merge(energy.reset_index(), ScimEn, on='Country').join(GDP, on='Country', how='inner')
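As a rough illustration of how join pairs a column on the left with the index on the right, here is a small sketch with made-up data. The variable names left and joined are assumptions for the example only.

```python
import pandas as pd

# Left frame: 'Country' is a regular column (toy values, invented).
left = pd.DataFrame({"Country": ["China", "Japan", "India"],
                     "Rank": [1, 3, 8]})
# Right frame: indexed by 'Country', like GDP in the question.
GDP = pd.DataFrame({"2015": [20.0, 10.0, 30.0]},
                   index=pd.Index(["Japan", "China", "Brazil"], name="Country"))

# `on='Country'` names a column of the calling (left) frame; the other
# frame is always matched on its index. how='inner' drops unmatched rows.
joined = left.join(GDP, on="Country", how="inner")
print(joined)
```

India has no GDP row and Brazil has no row on the left, so only the countries present on both sides survive the inner join.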
For more on merging 3 DataFrames with different indexes in Python, see the similar question on Stack Overflow: https://stackoverflow.com/questions/43931761/