python - How do I convert a categorical dtype variable to numeric?

I created an agebin column from the age column. I used pd.cut() to create agebin, as shown below:

traindata = data.assign(age_bins=pd.cut(data.age, 4, retbins=False, labels=range(1, 5), include_lowest=True))

data['agebin'] = traindata['age_bins']

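Now, when I look at data.info, agebin has dtype category. I want it to be numeric, because I get a ValueError when training my model. How can I convert dtype category to a number? I am confused about why the dtype is categorical: data['agebin'].head() shows that every value is 1, 2, 3 or 4, yet data.info reports agebin as category.

I want to change agebin from a categorical dtype to a numeric dtype.

Best answer

@nimrodz answered this question perfectly.

I just want to add that the reason you get the category dtype for age_bins is the behavior of pd.cut. Its documentation describes the return value as follows: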

out : pandas.Categorical, Series, or ndarray

An array-like object representing the respective bin for each value of x. The type depends on the value of labels.

sequence of scalars : returns a Series for Series x or a pandas.Categorical for all other inputs. The values stored within are whatever the type in the sequence is.

False : returns an ndarray of integers.
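The documented behavior can be checked with a small sketch. The DataFrame below is hypothetical sample data (the original data is not shown in the question); only the pd.cut arguments are taken from the question.

```python
import pandas as pd

# Hypothetical sample data; the question's actual DataFrame is not shown.
data = pd.DataFrame({"age": [5, 22, 37, 41, 58, 63, 80]})

# A sequence of labels on a Series input yields a Series with category dtype,
# even though the labels themselves are integers.
with_labels = pd.cut(data.age, 4, labels=range(1, 5), include_lowest=True)
print(with_labels.dtype)

# labels=False yields plain integer bin indicators, numbered from 0.
without_labels = pd.cut(data.age, 4, labels=False, include_lowest=True)
print(without_labels.dtype)
```

This is why head() shows 1, 2, 3 or 4 while info() still reports category: the values are the labels, but they are stored as categories.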

If you set labels=False, age_bins is returned as integers, but the numbering starts at 0. If you need it to start at 1, you can simply add one.

traindata = data.assign(age_bins=pd.cut(data.age, 4, retbins=False, labels=False, include_lowest=True))
traindata['age_bins'] = traindata.age_bins + 1
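If the column already exists with category dtype, it can also be converted after the fact. The following is a minimal sketch on hypothetical sample data, and not necessarily the approach taken in the answer referenced above. It assumes the labels are integers and that the column contains no missing values: astype(int) raises an error if NaN is present, and .cat.codes represents NaN as -1.

```python
import pandas as pd

# Hypothetical sample data; the question's actual DataFrame is not shown.
data = pd.DataFrame({"age": [5, 22, 37, 41, 58, 63, 80]})
data["agebin"] = pd.cut(data.age, 4, labels=range(1, 5), include_lowest=True)

# Option 1: cast the category labels (1..4) directly to integers.
as_int = data["agebin"].astype(int)

# Option 2: take the 0-based category codes and shift them to start at 1.
from_codes = data["agebin"].cat.codes + 1

print(as_int.tolist())
print(from_codes.tolist())
```

Option 1 keeps whatever the labels are, so it only works when they are numeric. Option 2 depends only on the order of the categories.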

Regarding "python - How do I convert a categorical dtype variable to numeric?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50888847/
