I'm trying to create dummy variables for one of the variables in this dataset, but I get an error and I don't know how to fix it. Any clues?
Code:
import pandas as pd
df = pd.read_excel(open('DID dataset.xlsx', 'rb'), sheet_name='All2')
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
Data: https://gyazo.com/79af7378c4e06c0f36f7f43d03a65119
Error:
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
Traceback (most recent call last):
File "<ipython-input-5-f9cbe04c43a1>", line 1, in <module>
Location_dummy = pd.get_dummies(df['Location'], drop_first=True)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2685, in __getitem__
return self._getitem_column(key)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2692, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2486, in _get_item_cache
values = self._data.get(item)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "C:\Users\Michael\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3065, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Location'
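The same error occurs when I just type
df['Location']
Since I can get dummies for the other variables, is there something wrong with this particular column in my Excel dataset, or is it something else?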
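Best answer
Your code itself is fine. The KeyError means that no column is named exactly 'Location', so the problem is most likely in the column name: it probably has leading or trailing whitespace, or different capitalization. To check, print the column names: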
print("Column headings:")
print(df.columns)
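If the printed output is hard to read, one way to make stray whitespace visible is to print each name with repr(). This is only a sketch: the small DataFrame below is a made-up stand-in for the Excel sheet, and its column names and values are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet; note the trailing space in 'Location '.
df = pd.DataFrame({"Location ": ["A", "B", "A"], "Year": [2010, 2011, 2012]})

# repr() wraps each name in quotes, so hidden spaces become visible.
for name in df.columns:
    print(repr(name))

# Names that change when stripped carry leading or trailing whitespace.
suspicious = [
    name for name in df.columns
    if isinstance(name, str) and name != name.strip()
]
print(suspicious)
```

Any name that shows up in the suspicious list is a candidate for the column that df['Location'] fails to find.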
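Depending on what that prints, you may be able to get the column data with df['Location '] (note the trailing space)
or df['location']
and then change the get_dummies call to use that same name.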
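As an alternative to typing the exact name with its whitespace, the column names could be cleaned once up front. The following is a minimal sketch under the assumption that whitespace is the cause; the DataFrame and its values are hypothetical and stand in for the Excel sheet.

```python
import pandas as pd

# Hypothetical data standing in for the Excel sheet; ' Location ' has stray spaces.
df = pd.DataFrame({
    " Location ": ["North", "South", "East", "North"],
    "Sales": [1, 2, 3, 4],
})

# Strip whitespace from string column names; leave non-string names untouched.
df = df.rename(columns=lambda c: c.strip() if isinstance(c, str) else c)

# The column is now reachable under the clean name.
location_dummy = pd.get_dummies(df["Location"], drop_first=True)
print(location_dummy.columns.tolist())
```

Renaming changes only the labels, not the data, so the rest of the analysis can keep using df['Location'] as originally written.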
Regarding python - unable to get dummies for a specific variable in a pandas DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54619925/