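I want to rewrite some Python code (actually from a Jupyter Book) in R. It computes a t-test on some data and then visualizes the data with a box plot.
I am a beginner in both Python and R, but I have made an attempt. Here is the Python code: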
import math
import numpy as np
import pandas as pd
from myst_nb import glue
from scipy.stats import ttest_ind
from matplotlib import pyplot as plt
labels = ['non-failing heart (NF)', 'failing heart (F)']
data = [(99, 52), (96, 40), (100, 38), (105, 18),
(np.nan, 11), (np.nan, 5), (np.nan, 42),
(np.nan, 55), (np.nan, 53), (np.nan, 39),
(np.nan, 42), (np.nan, 50)]
df = pd.DataFrame.from_records(data, columns=labels)
tt = ttest_ind(df['non-failing heart (NF)'],
df['failing heart (F)'],
equal_var=False, nan_policy='omit')
pvalue = tt.pvalue
glue('pvalue', math.ceil(pvalue * 1000.0) / 1000.0)
Here is what I tried:
library(math)
labels(data) <- c("non-failing heart (NF)", "failing heart (F)")
library(reticulate)
np <- import("numpy", convert=FALSE)
(x <- np$arange(1, 9)$reshape(2L, 2L))
## [[ 99. 52.]
## [ 96. 40.]
## [ 100. 38.]
## [ 105. 18.]
## [ np.nan. 11.]
## [ np.nan. 5.]
## [ np.nan. 42.]
## [ np.nan. 55.]
## [ np.nan 53.]
## [ np.nan 39.]
## [ np.nan. 42.]
## [ np.nan 50.]
## [ 23. 24.]]
df = pd.DataFrame.from_records(data, columns=labels)
tt = ttest_ind(df['non-failing heart (NF)'],
df['failing heart (F)'],
equal_var=False, nan_policy='omit')
pvalue = tt.pvalue
print(pvalue)
Best answer
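As mentioned above, t.test is one of the many statistical methods built into R's stats package, which is loaded by default. So simply build the same data frame, run the test, and extract whichever test statistics you need.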
Data construction (replicating some of the arguments of pd.DataFrame.from_records())
# Column names, matching the Python `labels` list
labels <- c('non-failing heart (NF)', 'failing heart (F)')
# One numeric pair per record; NA_real_ stands in for np.nan
data <- list(c(99, 52), c(96, 40), c(100, 38), c(105, 18),
             c(NA_real_, 11), c(NA_real_, 5), c(NA_real_, 42),
             c(NA_real_, 55), c(NA_real_, 53), c(NA_real_, 39),
             c(NA_real_, 42), c(NA_real_, 50))
# Turn each pair into a one-row data frame, stack the rows, then name the columns
df <- setNames(do.call(rbind.data.frame,
                       lapply(data, function(d) data.frame(d[1], d[2]))),
               labels)
df
# non-failing heart (NF) failing heart (F)
# 1 99 52
# 2 96 40
# 3 100 38
# 4 105 18
# 5 NA 11
# 6 NA 5
# 7 NA 42
# 8 NA 55
# 9 NA 53
# 10 NA 39
# 11 NA 42
# 12 NA 50
T-test
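# t.test() defaults to var.equal = FALSE (Welch) and drops NA values,
# which corresponds to equal_var=False, nan_policy='omit' in the Python call
results <- t.test(df[['non-failing heart (NF)']], df[['failing heart (F)']])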
results
# Welch Two Sample t-test
# data: df[["non-failing heart (NF)"]] and df[["failing heart (F)"]]
# t = 12.114, df = 13.43, p-value = 1.311e-08
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# 51.73232 74.10101
# sample estimates:
# mean of x mean of y
# 100.00000 37.08333
results$statistic
# t
# 12.11356
results$estimate
# mean of x mean of y
# 100.00000 37.08333
results$p.value
# [1] 1.311125e-08
ceiling(results$p.value * 1000.0)/ 1000.0
# [1] 0.001
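As a cross-check on the R output above, the Welch statistic and its Welch-Satterthwaite degrees of freedom can be recomputed by hand from the same observations. The following is only a minimal sketch in plain Python (standard library only). `welch_t` is a hypothetical helper name, not a function from scipy or R. It does not compute the p-value, which would need the CDF of the t distribution.

```python
import math
from statistics import mean, variance

# Same observations as in the question; NaN marks a missing NF measurement.
nan = float("nan")
data = [(99, 52), (96, 40), (100, 38), (105, 18),
        (nan, 11), (nan, 5), (nan, 42), (nan, 55),
        (nan, 53), (nan, 39), (nan, 42), (nan, 50)]


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    Hypothetical helper for illustration; missing values (NaN) are dropped
    first, mirroring nan_policy='omit' in scipy and the NA handling in R.
    """
    x = [v for v in x if not math.isnan(v)]
    y = [v for v in y if not math.isnan(v)]
    # Sample variance of each group divided by its size (squared standard errors)
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


nf = [row[0] for row in data]  # non-failing heart (NF)
f = [row[1] for row in data]   # failing heart (F)
t_stat, dof = welch_t(nf, f)
print(round(t_stat, 3), round(dof, 2))  # 12.114 13.43
```

Both values agree with the `t = 12.114, df = 13.43` line printed by `t.test`, which suggests the R call and the Python `ttest_ind(..., equal_var=False, nan_policy='omit')` call are running the same test.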
Regarding "python - Problem rewriting a t-test from Python in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64272528/