r - 如何定义 `f_n-chi-square and use ` uniroot` 的函数来查找置信区间?

标签 r statistics confidence-interval log-likelihood

我想获得以下问题的 95% 置信区间。 enter image description here

我在 R 代码中编写了函数 f_n 。我首先使用 Normal 随机采样 100 个样本,然后为 lambda 定义函数 h。然后我就可以得到f_n。我的问题是如何定义 f_n-chi-square 函数并使用 uniroot` 来查找置信区间。

# I first get 100 samples 
set.seed(201111)
x=rlnorm(100,0,2)

根据@RuiBarradas的回答,我尝试以下代码。

set.seed(2011111)
# I define function h, and use uniroot function to find lambda
h <- function(lam, n)
{
  sum((x - theta)/(1 + lam*(x - theta)))
}

# sample size
n <- 100
# the parameter of interest must be a value in [1, 12],
#true_theta<-1
#true_sd<- exp(2)
#x <- rnorm(n, mean = true_theta, sd = true_sd)
x=rlnorm(100,0,2)

xmax <- max(x)
xmin <- min(x)
theta_seq = seq(from = 1, to = 12, by = 0.01)

f_n <- rep(NA, length(theta_seq))

for (i in seq_along(theta_seq))
{
  theta <- theta_seq[i]
  lambdamin <- (1/n-1)/(xmax - theta)
  lambdamax <- (1/n-1)/(xmin - theta)
  lambda = uniroot(h, interval = c(lambdamin, lambdamax), n = n)$root
  f_n[i] = -sum(log(1 + lambda*(x - theta)))
}
j <- which.max(f_n)
max_fn <- f_n[j]
mle_theta <- theta_seq[j]

plot(theta_seq, f_n, type = "l", 
     main = expression(Estimated ~ theta),
     xlab = expression(Theta),
     ylab = expression(f[n]))
points(mle_theta, f_n[j], pch = 19, col = "red")
segments(
  x0 = c(mle_theta, xmin),
  y0 = c(min(f_n)*2, max_fn),
  x1 = c(mle_theta, mle_theta),
  y1 = c(max_fn, max_fn),
  col = "red",
  lty = "dashed"
)

我得到了以下f_n图。

enter image description here

对于 95% CI,我尝试


LR <- function(theta, lambda)
{
  2*sum(log(1 + lambda*(x - theta))) - qchisq(0.95, df = 1)
}

lambdamin <- (1/n-1)/(xmax - mle_theta)
lambdamax <- (1/n-1)/(xmin - mle_theta)
lambda <- uniroot(h, interval = c(lambdamin, lambdamax), n = n)$root

uniroot(LR, c(xmin, mle_theta), lambda = lambda)$root

结果是0.07198144。那么对数就是log(0.07198144)=-2.631347

但是下面的代码中有NA

uniroot(LR, c(mle_theta, xmax), lambda = lambda)$root

因此 95% CI 为 theta >= -2.631347

但问题是 95% CI 应该是一个闭区间...

最佳答案

这是一个解决方案。

首先,数据生成代码错误,参数theta在区间[1, 12]内,用生成数据rnorm(., 均值 = 0, .)。我将其更改为 true_theta = 5

set.seed(2011111)
# I define function h, and use uniroot function to find lambda
h <- function(lam, n)
{
  sum((x - theta)/(1 + lam*(x - theta)))
}

# sample size
n <- 100
# the parameter of interest must be a value in [1, 12],
true_theta <- 5
true_sd <- 2
x <- rnorm(n, mean = true_theta, sd = true_sd)
xmax <- max(x)
xmin <- min(x)
theta_seq <- seq(from = xmin + .Machine$double.eps^0.5, 
                 to = xmax - .Machine$double.eps^0.5, by = 0.01)
f_n <- rep(NA, length(theta_seq))

for (i in seq_along(theta_seq))
{
  theta <- theta_seq[i]
  lambdamin <- (1/n-1)/(xmax - theta)
  lambdamax <- (1/n-1)/(xmin - theta)
  lambda = uniroot(h, interval = c(lambdamin, lambdamax), n = n)$root
  f_n[i] = -sum(log(1 + lambda*(x - theta)))
}
j <- which.max(f_n)
max_fn <- f_n[j]
mle_theta <- theta_seq[j]

plot(theta_seq, f_n, type = "l", 
     main = expression(Estimated ~ theta),
     xlab = expression(Theta),
     ylab = expression(f[n]))
points(mle_theta, f_n[j], pch = 19, col = "red")
segments(
  x0 = c(mle_theta, xmin),
  y0 = c(min(f_n)*2, max_fn),
  x1 = c(mle_theta, mle_theta),
  y1 = c(max_fn, max_fn),
  col = "red",
  lty = "dashed"
)

LR <- function(theta, lambda)
{
  2*sum(log(1 + lambda*(x - theta))) - qchisq(0.95, df = 1)
}

lambdamin <- (1/n-1)/(xmax - mle_theta)
lambdamax <- (1/n-1)/(xmin - mle_theta)
lambda <- uniroot(h, interval = c(lambdamin, lambdamax), n = n)$root

uniroot(LR, c(xmin, mle_theta), lambda = lambda)$root
#> [1] 4.774609

reprex package 于 2022 年 3 月 25 日创建(v2.0.1)

单侧 CI95 为 theta >= 4.774609

关于r - 如何定义 `f_n-chi-square and use ` uniroot` 的函数来查找置信区间?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/71593693/

相关文章:

r - 从 .onLoad 包函数中调用 getNamespaceExports()

r - dplyr `pivot_longer()` 对象未找到,但它就在那里?

r - 如何使用R在散点图上绘制R平方值?

c++ - 用 C++ 快速实现反不完全 Beta 函数

r - 如何使用带有 set_value_labels 的 mutate_at 来更改多个变量的值标签?

r - 在 R 中计算每个周期的方差

python - 使用 sklearn GMM 计算概率

python - 将置信区间计算为分位数

graph - 在 gnuplot 中绘制第 n 行的平均值

r - R 中置信区间外的有条件的颜色数据点