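r - How can I reproduce this "apply" example using the dplyr package in R?

Tags: r statistics grouping dplyr

I want to use the informative stat.desc function from the pastecs package to describe many columns of my data frame by group. Let's take the iris dataset as an MWE. For each column I do this: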

by(iris$Sepal.Length,list(iris$Species),pastecs::stat.desc,norm = TRUE)
by(iris$Sepal.Width,list(iris$Species),pastecs::stat.desc,norm = TRUE)
by(iris$Petal.Length,list(iris$Species),pastecs::stat.desc,norm = TRUE)
by(iris$Petal.Width,list(iris$Species),pastecs::stat.desc,norm = TRUE)

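But this quickly becomes tedious when you have many columns, so you generally want to vectorize it. After several trials, I found a way using the apply and by() functions, as follows: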

apply (iris[,1:4],2,function (x) by (x,list (iris$Species),pastecs::stat.desc,norm=TRUE))

The list argument specifies the grouping variable, and norm = TRUE is an argument passed on to stat.desc that adds normality statistics for the data.
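The argument forwarding described above can be sketched with a small stand-in function in place of pastecs::stat.desc. Here `describe` is a hypothetical helper written only for this illustration, not part of any package; the point is that any extra argument given to by() after the function is passed on to it for every group. The sketch uses lapply() instead of apply(), on the assumption that you would rather not have apply() convert the data frame to a matrix first (harmless for four numeric columns, but it matters with mixed column types).

```r
# Hypothetical stand-in for pastecs::stat.desc: returns a named numeric vector.
describe <- function(x, digits = 2) {
  round(c(n = length(x), mean = mean(x), sd = sd(x)), digits)
}

# Loop over the four numeric columns. For each one, by() splits the values by
# Species and forwards `digits = 3` to describe(), in the same way that
# `norm = TRUE` is forwarded to stat.desc in the call above.
res <- lapply(iris[1:4], function(x) by(x, list(iris$Species), describe, digits = 3))

# One element per column, each holding one result per species.
res$Sepal.Length
```

The result is a named list with one "by" object per column, so an individual group can be pulled out by position, for example `res$Sepal.Length[[1]]` for the first species.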

Results

$Sepal.Length
: setosa
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      4.30000      5.80000      1.50000    250.30000      5.00000      5.00600      0.04985 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.10018      0.12425      0.35249      0.07041      0.11298      0.16782     -0.45087     -0.34059      0.97770      0.45951 
------------------------------------------------------------------------------------------------------ 
: versicolor
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      4.90000      7.00000      2.10000    296.80000      5.90000      5.93600      0.07300 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.14669      0.26643      0.51617      0.08696      0.09914      0.14727     -0.69391     -0.52418      0.97784      0.46474 
------------------------------------------------------------------------------------------------------ 
: virginica
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      4.90000      7.90000      3.00000    329.40000      6.50000      6.58800      0.08993 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.18071      0.40434      0.63588      0.09652      0.11103      0.16493     -0.20326     -0.15354      0.97118      0.25831 

$Sepal.Width
: setosa
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      2.30000      4.40000      2.10000    171.40000      3.40000      3.42800      0.05361 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.10773      0.14369      0.37906      0.11058      0.03873      0.05753      0.59595      0.45018      0.97172      0.27153 
------------------------------------------------------------------------------------------------------ 
: versicolor
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      2.00000      3.40000      1.40000    138.50000      2.80000      2.77000      0.04438 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.08918      0.09847      0.31380      0.11328     -0.34136     -0.50708     -0.54932     -0.41495      0.97413      0.33800 
------------------------------------------------------------------------------------------------------ 
: virginica
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      2.20000      3.80000      1.60000    148.70000      3.00000      2.97400      0.04561 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.09165      0.10400      0.32250      0.10844      0.34428      0.51141      0.38038      0.28734      0.96739      0.18090 

$Petal.Length
: setosa
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      1.00000      1.90000      0.90000     73.10000      1.50000      1.46200      0.02456 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.04935      0.03016      0.17366      0.11879      0.10010      0.14869      0.65393      0.49397      0.95498      0.05481 
------------------------------------------------------------------------------------------------------ 
: versicolor
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      3.00000      5.10000      2.10000    213.00000      4.35000      4.26000      0.06646 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.13355      0.22082      0.46991      0.11031     -0.57060     -0.84760     -0.19026     -0.14372      0.96600      0.15848 
------------------------------------------------------------------------------------------------------ 
: virginica
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      4.50000      6.90000      2.40000    277.60000      5.55000      5.55200      0.07805 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.15685      0.30459      0.55189      0.09940      0.51692      0.76785     -0.36512     -0.27581      0.96219      0.10978 

$Petal.Width
: setosa
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
   5.000e+01    0.000e+00    0.000e+00    1.000e-01    6.000e-01    5.000e-01    1.230e+01    2.000e-01    2.460e-01    1.490e-02 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
   2.995e-02    1.111e-02    1.054e-01    4.284e-01    1.180e+00    1.752e+00    1.259e+00    9.508e-01    7.998e-01    8.659e-07 
------------------------------------------------------------------------------------------------------ 
: versicolor
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      1.00000      1.80000      0.80000     66.30000      1.30000      1.32600      0.02797 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.05620      0.03911      0.19775      0.14913     -0.02933     -0.04357     -0.58731     -0.44365      0.94763      0.02728 
------------------------------------------------------------------------------------------------------ 
: virginica
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median         mean      SE.mean 
    50.00000      0.00000      0.00000      1.40000      2.50000      1.10000    101.30000      2.00000      2.02600      0.03884 
CI.mean.0.95          var      std.dev     coef.var     skewness     skew.2SE     kurtosis     kurt.2SE   normtest.W   normtest.p 
     0.07805      0.07543      0.27465      0.13556     -0.12181     -0.18094     -0.75396     -0.56953      0.95977      0.08695 

Question

How can I reproduce these results using the dplyr package?

My failed attempt was:

iris %>%
  group_by (Species) %>%
  summarise_each(funs(pastecs::stat.desc,norm=TRUE))

Best answer

Here is an option using dplyr:

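library(pastecs)
library(dplyr)
res <- iris %>% 
          group_by(Species) %>% 
          # within each group, apply stat.desc to every column except Species
          do(data.frame(lapply(.[setdiff(names(.), 'Species')],
                           stat.desc, norm = TRUE))) %>%
          # label each row of a group with the name of its statistic
          mutate(measure = names(stat.desc(Sepal.Length, norm = TRUE)))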

Edit: added the names corresponding to the stat.desc output (based on @Jaap's suggestion).
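For readers on a recent dplyr, the same idea can be sketched without do(), which has been superseded since dplyr 1.0.0. This is an alternative, not the answer's own method, and it assumes dplyr 1.1.0 or later, where reframe() is available for summaries that return more than one row per group. That matters here because stat.desc returns a whole vector of statistics, whereas summarise-style verbs expect a single value per group. The name `res2` is only illustrative.

```r
library(dplyr)    # assumes dplyr >= 1.1.0 for reframe()
library(pastecs)

res2 <- iris %>%
  group_by(Species) %>%
  reframe(
    # Statistic names come first, while Sepal.Length still holds the raw data.
    measure = names(stat.desc(Sepal.Length, norm = TRUE)),
    # Apply stat.desc to every numeric column; each call returns one value
    # per statistic, so every group contributes one row per statistic.
    across(where(is.numeric), ~ unname(stat.desc(.x, norm = TRUE)))
  )

res2
```

The result is an ungrouped tibble with one row per species and statistic, so a single figure can be looked up with an ordinary filter, for example `filter(res2, Species == "setosa", measure == "mean")`.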

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/35817219/
