r - Bootstrapping/resampling a matrix by row in R

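I have a matrix x with 20 rows and 10 columns. I need to sample 5 rows at a time (with replacement) and compute the column means. I need to repeat this process 15 times and report the column means from each repetition.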

As an example, I used the resample library in R to do this.

# Create a random matrix
library("resample")

set.seed(1234)
x <- matrix( round(rnorm(200, 5)), ncol=10)

## Bootstrap 15 times by resampling 5 rows at a time.
k <- bootstrap(x,colMeans,B = 15,block.size=5)

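My concern with the procedure above is that I am not sure whether the rows stay "intact", meaning that the column means are computed within the 5 selected rows. My second question is whether block.size in the function above randomly selects 5 rows, computes colMeans, repeats this 15 times, and reports the results in replicates, as shown below?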

 k$replicates
      stat1 stat2 stat3 stat4 stat5 stat6 stat7 stat8 stat9 stat10
 [1,]  4.65  4.50  4.65  5.25  5.25  5.05  4.90  5.60  4.85   5.20
 [2,]  4.60  4.65  4.80  5.60  5.50  5.20  5.05  5.10  5.00   5.40
 [3,]  4.90  4.35  4.55  5.20  5.80  4.80  4.60  5.30  5.15   4.20
 [4,]  4.75  4.65  4.15  5.30  5.25  4.80  4.70  5.15  5.55   4.35
 [5,]  4.55  4.65  4.50  5.40  5.40  4.90  4.85  5.55  5.00   4.75
 [6,]  4.65  4.25  5.00  5.35  5.20  5.05  4.95  5.20  4.75   5.20
 [7,]  4.70  4.30  4.75  5.35  5.50  4.75  5.00  5.45  4.85   4.75
 [8,]  4.75  4.15  4.95  5.10  5.55  4.70  4.70  5.30  5.05   4.90
 [9,]  4.40  4.30  4.50  5.25  5.50  4.70  4.75  5.35  4.95   4.85
[10,]  4.85  4.50  4.35  5.25  5.70  4.75  4.65  5.35  4.95   4.10
[11,]  4.35  4.50  4.65  5.30  5.20  4.75  4.85  5.30  5.20   5.20
[12,]  4.25  4.55  5.20  5.00  5.45  4.80  4.90  5.15  5.30   5.00
[13,]  4.30  4.70  4.55  5.05  5.35  4.85  5.00  4.90  5.75   4.60
[14,]  4.70  4.35  4.95  5.25  5.40  4.85  4.90  5.20  5.40   5.20
[15,]  4.55  4.70  4.40  5.15  5.20  4.70  4.80  5.45  6.00   4.90

I am not tied to this function or package; any other suggestions would be much appreciated.

Thanks a lot.

Best answer

Without using a package, you can do this:

# your data
set.seed(1234)
x <- matrix( round(rnorm(200, 5)), ncol=10)

# reset seed for this sampling exercise; define sample size and number of iterations
set.seed(1)
samp_size <- 5
iter <- 15

# here are 15 blocks of 5 numbers, which will index rows of your matrix x
samp_mat <- matrix(sample(seq_len(nrow(x)), samp_size * iter, replace = TRUE),
                   ncol = samp_size, byrow = TRUE)

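# example, look at the first 3 blocks
# (the output below is from R < 3.6.0; R >= 3.6.0 changed the default
#  algorithm used by sample(), so the same seed may give different indices)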
samp_mat[1:3,]

#       [,1] [,2] [,3] [,4] [,5]
# [1,]    6    8   12   19    5
# [2,]   18   19   14   13    2
# [3,]    5    4   14    8   16

# so, you can get the colMeans for the first block like this
# (i.e., colMeans for rows 6, 8, 12, 19, 5 in this case)
colMeans(x[samp_mat[1,],])

# for all 15 blocks:
t(apply(samp_mat, 1, function(i) colMeans(x[i,])))
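
On the "intact rows" concern: indexing with x[i, ] pulls out whole rows, so each set of column means is computed within the 5 sampled rows only. A minimal sketch of how you could check this yourself, assuming the same x and samp_mat as above (the names res, manual, dim_ok, rows_ok and fifths_ok are only illustrative):

```r
# Rebuild the data and the index matrix so this block runs on its own
set.seed(1234)
x <- matrix(round(rnorm(200, 5)), ncol = 10)

set.seed(1)
samp_size <- 5
iter <- 15
samp_mat <- matrix(sample(seq_len(nrow(x)), samp_size * iter, replace = TRUE),
                   ncol = samp_size, byrow = TRUE)

# One row of column means per block of sampled row indices
res <- t(apply(samp_mat, 1, function(i) colMeans(x[i, , drop = FALSE])))

# Check 1: one result row per iteration, one column per column of x
dim_ok <- all(dim(res) == c(iter, ncol(x)))

# Check 2: a result row equals the column means of that block's rows, done by hand
manual <- colMeans(x[samp_mat[2, ], , drop = FALSE])
rows_ok <- isTRUE(all.equal(res[2, ], manual))

# Check 3: x holds whole numbers, so a mean over 5 rows must be a multiple of 0.2
fifths_ok <- all(abs(res * 5 - round(res * 5)) < 1e-8)

c(dim_ok, rows_ok, fifths_ok)
```

The third check is also a useful way to read the k$replicates output in the question: values such as 4.65 are multiples of 0.05 but not of 0.2, which is what averaging all 20 rows would produce, not 5. That suggests block.size is not the number of rows sampled, although I have not checked what the resample package actually does with it.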

If you want to combine it all into a single statement, you can:

t(apply(matrix(sample(seq_len(nrow(x)), 5 * 15, replace = TRUE), ncol = 5, byrow = TRUE), 1,
        function(i) colMeans(x[i, ])))

(but this is obviously less readable)
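
If you need this more than once, a middle ground between the step-by-step version and the one-liner is a small helper function. The following is only a sketch: row_bootstrap_means and its argument names are made up for illustration and do not come from any package. It draws the row indices one iteration at a time, so its results need not match the output above number for number.

```r
# Hypothetical helper: column means of `samp_size` rows drawn with replacement,
# repeated `iter` times. Returns an iter x ncol(x) matrix.
row_bootstrap_means <- function(x, samp_size, iter) {
  stopifnot(is.matrix(x), samp_size >= 1, iter >= 1)
  out <- matrix(NA_real_, nrow = iter, ncol = ncol(x))
  for (b in seq_len(iter)) {
    # sample whole-row indices with replacement
    rows <- sample.int(nrow(x), samp_size, replace = TRUE)
    # drop = FALSE keeps a matrix even when samp_size is 1
    out[b, ] <- colMeans(x[rows, , drop = FALSE])
  }
  out
}

# Usage with the data from the question
set.seed(1234)
x <- matrix(round(rnorm(200, 5)), ncol = 10)

set.seed(1)
boot_means <- row_bootstrap_means(x, samp_size = 5, iter = 15)
dim(boot_means)  # 15 replicates by 10 columns
```

The explicit loop is a deliberate choice here: it always returns a matrix of the same shape, including the edge cases of a single column or a single sampled row, where apply() or replicate() would simplify the result differently.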

Regarding "r - Bootstrapping/resampling a matrix by row in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27409782/
