r - Determine the data types of a data frame's columns

I'm using R and have loaded data into a data frame using read.csv(). How do I determine the data type of each column in the data frame?

Best answer

Your best bet to start is str() (see ?str). To explore some examples, let's make some data:

set.seed(3221)  # this makes the example exactly reproducible
my.data <- data.frame(y=rnorm(5), 
                      x1=c(1:5), 
                      x2=c(TRUE, TRUE, FALSE, FALSE, FALSE),
                      X3=letters[1:5],
                      stringsAsFactors=TRUE)  # default is FALSE since R 4.0.0; needed for X3 to be a factor

@Wilmer E Henao H's solution is very streamlined:

sapply(my.data, class)
        y        x1        x2        X3 
"numeric" "integer" "logical"  "factor"
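
One caveat with sapply(my.data, class): a column can carry more than one class, in which case sapply() cannot simplify the result to a character vector and returns a list instead. The sketch below illustrates this with a hypothetical data frame (df and its when column are made up for illustration, not part of the original answer) and shows one possible way to keep a one-string-per-column summary.

```r
# Hypothetical data frame with a date-time column (not from the original answer)
df <- data.frame(id = 1:3)
df$when <- as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-03"), tz = "UTC")

# POSIXct columns have two classes, so the lengths differ and sapply() returns a list
classes <- sapply(df, class)
is.list(classes)

# Collapsing each column's classes into a single string keeps a character vector
col_classes <- vapply(df, function(col) paste(class(col), collapse = "/"),
                      character(1))
col_classes
```

vapply() is used here only because it guarantees exactly one string per column; taking class(col)[1] would be an equally reasonable choice if only the first class matters.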

Using str() gets you that information plus extra goodies (such as the levels of your factors and the first few values of each variable):

str(my.data)
'data.frame':  5 obs. of  4 variables:
$ y : num  1.03 1.599 -0.818 0.872 -2.682
$ x1: int  1 2 3 4 5
$ x2: logi  TRUE TRUE FALSE FALSE FALSE
$ X3: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
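
Since the question starts from read.csv(), here is a small sketch of the same checks applied to imported data. The CSV content and the names csv_text and scores are invented for illustration; the text argument is used only so the example runs without a file on disk.

```r
# Hypothetical CSV content, passed inline instead of reading from a file
csv_text <- "id,score,passed,grade
1,90.5,TRUE,a
2,71.0,FALSE,b
3,88.25,TRUE,a"

# stringsAsFactors is set explicitly so the result does not depend on the R version
scores <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Compact overview of every column: type plus the first few values
str(scores)

# The same information as a named character vector
col_types <- sapply(scores, class)
col_types
```

If a column is not guessed the way you want, the colClasses argument of read.csv() lets you state the types up front instead of converting afterwards.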

@Gavin Simpson's approach is also streamlined, but provides slightly different information than class():

sapply(my.data, typeof)
       y        x1        x2        X3 
"double" "integer" "logical" "integer"

For more information about class, typeof, and the middle child, mode, see this excellent SO thread: A comprehensive survey of the types of things in R. 'mode' and 'class' and 'typeof' are insufficient.
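
To see how the three functions differ on the example data, they can be put side by side. This is only a sketch: type_summary is a name introduced here, and my.data is rebuilt (with stringsAsFactors = TRUE assumed, so that X3 is a factor) to keep the snippet self-contained.

```r
# Rebuild the example data; stringsAsFactors = TRUE is assumed so X3 is a factor
my.data <- data.frame(y  = rnorm(5),
                      x1 = 1:5,
                      x2 = c(TRUE, TRUE, FALSE, FALSE, FALSE),
                      X3 = letters[1:5],
                      stringsAsFactors = TRUE)

# One row per column of my.data, one column per inspection function
type_summary <- data.frame(class  = sapply(my.data, class),
                           typeof = sapply(my.data, typeof),
                           mode   = sapply(my.data, mode))
type_summary
```

The factor column is where the three disagree most: class() reports "factor", typeof() reports the underlying "integer" storage, and mode() reports "numeric". mode() also reports "numeric" for both the double column y and the integer column x1.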

A similar question about determining the data types of a data frame's columns can be found on Stack Overflow: https://stackoverflow.com/questions/21125222/
