r - Strange behavior of the by() function in R 3.0.0?

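I am trying to get familiar with the vast universe that is R. The by() function looks like exactly what I need, but it does not seem to like it when I select more than one column of a data frame.

I used the standard iris dataset. It behaves well when I select a single column, but not when I select several. The example is taken from a reference book, though of course it may contain a typo.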

First version (works)

> by(iris[,2],Species,mean)
Species: setosa
[1] 3.428
------------------------------------------------------------ 
Species: versicolor
[1] 2.77
------------------------------------------------------------ 
Species: virginica
[1] 2.974

Second version (does not work)

> by(iris[,2:3],Species,mean)
Species: setosa
[1] NA
------------------------------------------------------------ 
Species: versicolor
[1] NA
------------------------------------------------------------ 
Species: virginica
[1] NA
Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
3: In mean.default(data[x, , drop = FALSE], ...) :

Any explanation would be greatly appreciated.

Best answer

The messages you are getting come from the mean function, not from by.
You are passing a data.frame where mean expects a vector.
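
To make this concrete, here is a minimal sketch that inspects what by() hands to its function for each group. It assumes only the built-in iris data; the anonymous function and the variable names are illustrative, not part of the original question.

```r
# Sketch: see what by() passes to FUN for each group (built-in iris data).
# The anonymous function is illustrative only.
pieces <- by(iris[, 2:3], iris$Species, function(d) class(d))
print(pieces)

# Each piece is a data.frame, which mean() does not average in R 3.0.0,
# hence the warning. colMeans() accepts the data.frame and averages each column.
setosa_piece <- iris[iris$Species == "setosa", 2:3]
print(colMeans(setosa_piece))
```

Each group should report "data.frame", which matches the `data[x, , drop = FALSE]` expression visible in the warning text.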

If you instead use a function that works on data.frames, no warning is raised:

by(iris[,2:3],iris$Species, colMeans)
by(iris[,2:3],iris$Species, print)
etc

If needed, you can nest the *ply-style functions (e.g. by, tapply, lapply, etc.).
Try this, for example:

by(iris[,2:3],iris$Species,lapply, mean)
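
The nested call above returns one list per species. If a compact table is easier to read, one possible variation (a sketch, not the only way to do it) is to use sapply inside by and then stack the per-group vectors. The names `res` and `means` are made up for this illustration.

```r
# Sketch: per-species column means collected into one matrix (built-in iris data).
# sapply() returns a named numeric vector for each group's data.frame.
res <- by(iris[, 2:3], iris$Species, function(d) sapply(d, mean))

# Stack the per-group vectors row by row: one row per species, one column per variable.
means <- do.call(rbind, res)
print(means)
```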

As for mean:

Note that if you try to call mean on any data.frame, it complains with the same warning:

mean(iris[,2:3])
mean(iris[iris$Species==iris$Species[[1]] ,2:3])

Use colMeans instead:

colMeans(iris[iris$Species==iris$Species[[1]] ,2:3])
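
As a quick check of the two points above, the following sketch compares the behaviors side by side. It assumes R 3.0.0 or later, where mean() on a data.frame returns NA with a warning; the variable names are illustrative.

```r
# Sketch: mean() versus colMeans() on a data.frame subset (built-in iris data).
setosa <- iris[iris$Species == "setosa", 2:3]

# mean() does not handle a data.frame here: it warns and returns NA.
na_result <- suppressWarnings(mean(setosa))
print(na_result)

# colMeans() averages each column; sapply(..., mean) gives the same values.
a <- colMeans(setosa)
b <- sapply(setosa, mean)
print(all.equal(a, b))
```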

On an unrelated note: avoid using attach ;)
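
The question's calls refer to a bare `Species`, which only resolves if iris has been attached. A sketch of two ways to get the same per-species means without attach, either by naming the column explicitly or by scoping the lookup with with(); the variable names are illustrative.

```r
# Sketch: the working example from the question without attach() (built-in iris data).
# Option 1: refer to the grouping column explicitly.
m1 <- tapply(iris$Sepal.Width, iris$Species, mean)

# Option 2: evaluate the call inside the data frame with with().
m2 <- with(iris, tapply(Sepal.Width, Species, mean))

print(m1)
print(all.equal(m1, m2))
```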

Regarding "r - Strange behavior of the by() function in R 3.0.0?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19325399/
