我可以对两个样本 t 检验进行此操作,但不能进行中值检验、Wilcoxon 检验或 Hodges Lehmann 检验
data_2000 <- c(500,450,600,700,550,551,552)
data_2019 <- c(560,460,620,720,540,600,750)
mean(data_2000)
mean(data_2019)
mean(data_2019) - mean(data_2000)
combined_data <- c(data_2000, data_2019)
set.seed(123)
null_dist <- c()
for (i in 1:100000) {
shuffled_data <- sample(combined_data)
shuffled_2000 <- shuffled_data[1:7]
shuffled_2019 <- shuffled_data[8:14]
null_dist[i] <- mean(shuffled_2019) - mean(shuffled_2000)
}
(p_value <- (sum(null_dist >= 49.57143) + sum(null_dist <=
`enter code here`-49.57143))/length(null_dist))
最佳答案
我认为这就是您想要做的。我尽可能少地改变了你的代码。有像 infer
这样的包可以为您完成此操作,for 循环不是最有效的,但它已经足够好了,可以帮助您学习。只要我们循环,我就会同时计算平均值和中位数,因为代码的所有其他部分都是相同的。 ifelse
是一种将 1 和 0 相加的简单方法。
data_2000 <- c(500,450,600,700,550,551,552)
data_2019 <- c(560,460,620,720,540,600,750)
delta_mean <- mean(data_2019) - mean(data_2000)
delta_median <- median(data_2019) - median(data_2000)
combined_data <- c(data_2000, data_2019)
trials <- 100000
set.seed(123)
mean_diff <- c()
median_diff <- c()
for (i in 1:trials) {
shuffled_data <- sample(combined_data)
shuffled_2000 <- shuffled_data[1:7]
shuffled_2019 <- shuffled_data[8:14]
mean_diff[i] <- mean(shuffled_2019) - mean(shuffled_2000)
median_diff[i] <- median(shuffled_2019) - median(shuffled_2000)
}
p_mean <- sum(ifelse(mean_diff > delta_mean | mean_diff < -1 * delta_mean, 1, 0)) / trials
p_median <- sum(ifelse(median_diff > delta_median | median_diff < -1 * delta_median, 1, 0)) / trials
p_mean
#> [1] 0.31888
p_median
#> [1] 0.24446
跟进您有关 HL 测试的问题。引用维基百科
The Hodges–Lehmann statistic also estimates the difference between two populations. For two sets of data with m and n observations, the set of two-element sets made of them is their Cartesian product, which contains m × n pairs of points (one from each set); each such pair defines one difference of values. The Hodges–Lehmann statistic is the median of the m × n differences.
您可以使用以下代码在您的数据上运行它...
不要运行 100,000 次,每次的答案都是相同的,因为您已经完成了所有 49 种可能的配对
hl_df <- expand.grid(data_2019, data_2000)
hl_df$pair_diffs <- hl_df$Var1 - hl_df$Var2
median(hl_df$pair_diffs)
[1] 49
关于r - 如何在 R 中编写 Mood 中值检验的排列等价代码? (使用排列获得 p 值),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/63315645/