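python pandas: error when getting a value from a dict keyed by dtype objects

I have a DataFrame df with more than 2000 columns of various data types. I plan to convert the non-numeric categorical variables to numeric ones, so I first need to get the names of those columns.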

col_dataType = df.columns.to_series().groupby(df.dtypes).groups

col_dataType is a dictionary with these 3 keys:

col_dataType.keys()
Out: [dtype('O'), dtype('int64'), dtype('float64')]

Now, when I try to get the columns with the object data type, i.e. the list stored under dtype('O'):

col_dataType["dtype('O')"]

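It keeps raising a KeyError, and it doesn't work without the double quotes either. How can I get the column names?

I went with unutbu's solution.

Best answer

You can use df.select_dtypes: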

In [58]: df = pd.DataFrame({'foo':[1,2,3], 'bar':['a','b','c'], 'baz':[1.2,3.4,5.6]})

In [59]: df.select_dtypes(exclude=[np.number])
Out[59]: 
  bar
0   a
1   b
2   c
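
Since the question asks for column names rather than the sub-frame, here is a minimal sketch of reading them off the select_dtypes result. The variable name non_numeric_cols is illustrative, and the 'bar' column is explicitly given dtype=object (an assumption made so the example behaves the same across pandas versions):

```python
import numpy as np
import pandas as pd

# Same toy frame as above; 'bar' is forced to object dtype so the
# result does not depend on the pandas version's default string dtype.
df = pd.DataFrame({
    "foo": [1, 2, 3],
    "bar": pd.Series(["a", "b", "c"], dtype=object),
    "baz": [1.2, 3.4, 5.6],
})

# select_dtypes returns a DataFrame; its .columns holds the names.
non_numeric_cols = df.select_dtypes(exclude=[np.number]).columns.tolist()
print(non_numeric_cols)  # ['bar']
```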

The keys in col_dataType are numpy.dtype objects, not strings:

In [67]: [type(item) for item in col_dataType.keys()]
Out[67]: [numpy.dtype, numpy.dtype, numpy.dtype]

So

In [68]: col_dataType[np.dtype('O')]
Out[68]: ['bar']

works, but I think df.select_dtypes should be preferred, since it is a pandas API method built for this purpose.
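
To illustrate why the string key fails while the dtype key works, here is a small sketch. It rebuilds a dtype-keyed dict with a plain loop instead of the groupby expression from the question (an assumption made to keep the example simple); cols_by_dtype and the other variable names are hypothetical:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "foo": [1, 2, 3],
    "bar": pd.Series(["a", "b", "c"], dtype=object),  # explicit object dtype
    "baz": [1.2, 3.4, 5.6],
})

# Group column names by dtype; the dict keys are numpy.dtype objects.
cols_by_dtype = {}
for name, dtype in df.dtypes.items():
    cols_by_dtype.setdefault(dtype, []).append(name)

# A string that merely looks like the repr of the key is a different object
# with a different hash, so the lookup misses.
string_key_found = "dtype('O')" in cols_by_dtype

# An actual numpy.dtype instance matches the stored key.
object_cols = cols_by_dtype[np.dtype("O")]
print(string_key_found, object_cols)  # False ['bar']
```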

Regarding this error when getting a value from a dtype-keyed dict in python pandas, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30418959/
