r - How to calculate disease prevalence by a variable in R

Tags: r dplyr

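I am completely lost trying to calculate disease prevalence by a variable (in my case, postal code). I have tried everything, but nothing seems to work :(

I know disease prevalence is easy to calculate (the number of cases divided by the total population), but I cannot get R to sum the cases and the population by postal code and then divide one by the other.

The column I want to calculate prevalence for is called "Lyme"; it is a binary variable (0 = negative, 1 = positive). The "FSA" column holds my postal code. Please help!

Here is my code: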

Data.All.df <- data.frame(Data.All) ## Create Data Frame from Data file
Data.All.df.2008 <- subset(Data.All.df, Year=="2008") ##only use 2008
library(dplyr)
Data.All.df.2008 <- Data.All.df.2008 %>% 
                              group_by(FSA) %>% 
                              mutate_each(funs(Cases = ((Lyme=="1")/((Lyme=="0")+(Lyme=="1")))))


X.1 X Source Patient Accession Customer Year Date Country City Province Postal Name Age Gender Species Breed SNAP Apspp Ehrspp HW Lyme Coinfections dupID FSA
1710    4913    4913    Veterinary Clinic   Bronson Sprartacus796575981360  7.97e+13    79657   2008    2008-01-08  Canada  WINDSOR ON  N8N 3T4 Bronson Sprartacus  132 Not Specified   Canine  Not Specified   4Dx 0   0   0   0   0   TRUE    N8N
1711    4915    4915    Veterinary Clinic   Scotty9233669481432 9.23e+13    92336   2008    2008-01-08  Canada  WINDSOR ON  N8R 1A5 Scotty  84  Not Specified   Canine  Not Specified   4Dx 0   0   0   0   0   TRUE    N8R
1712    4916    4916    Veterinary Clinic   Hershey9233683161435    9.23e+13    92336   2008    2008-01-08  Canada  WINDSOR ON  N8R 1A5 Hershey 48  Not Specified   Canine  Not Specified   4Dx 0   0   0   0   0   TRUE    N8R
1713    4918    4918    Veterinary Clinic   Brandy7965736441362 7.97e+13    79657   2008    2008-01-09  Canada  WINDSOR ON  N8N 3T4 Brandy  156 Not Specified   Canine  Not Specified   4Dx 0   0   0   0   0   TRUE    N8N
1714    4919    4919    Veterinary Clinic   Trish9233699481443  9.23e+13    92336   2008    2008-01-10  Canada  WINDSOR ON  N8R 1A5 Trish   132 Not Specified   Canine  Not Specified   4Dx 0   0   0   0   0   TRUE    N8R
1715    4929    4929    Veterinary Clinic   Lexie8001685020761364   8.00e+13    80016   2008    2008-01-17  Canada  HALIFAX NS  B3L 2C2 Lexie   29  Spayed  Canine  Non-Sporting    4Dx 0   0   0   0   0   TRUE    B3L
1716    4937    4937    Veterinary Clinic   CUBBIE79700431  7.97e+12    79700   2008    2008-01-21  Canada  DARTMOUTH   NS  B2W 2N3 CUBBIE  118 Spayed  Canine  Non-Sporting    4Dx 0   0   0   0   0   TRUE    B2W
1717    4945    4945    Veterinary Clinic   Stevie7965765291433 7.97e+13    79657   2008    2008-01-25  Canada  WINDSOR ON  N8N 3T4 Stevie  36  Not Specified   Canine  Not Specified   4Dx 0   0   0   0   0   TRUE    N8N
1718    4947    4947    Veterinary Clinic   Bailey9233644191501 9.23e+13    92336   2008    2008-01-25  Canada  WINDSOR ON  N8R 1A5 Bailey  132 Not Specified   Canine  Not Specified   4Dx 0   0   0   0   0   TRUE    N8R
1719    4948    4948    Veterinary Clinic   ZAK925369448482 9.25e+12    92536   2008    2008-01-25  Canada  HUNTSVILLE  ON  P1H 1B5 ZAK 96  Neutered    Canine  Hound   4Dx 0   0   0   0   0   TRUE    P1H
17

Best answer

Using the following minimal example data:

# Generate data.
set.seed(0934)
Data.All.df.2008 <- data.frame(FSA = sample(c("N8N", "N8R", "B3L", "P1H"), 50, TRUE),
                               Lyme = sample(0:1, 50, TRUE),
                               stringsAsFactors = FALSE)

# First 6 observations (the default for head()).
head(Data.All.df.2008)

#   FSA Lyme
# 1 N8N    1
# 2 P1H    1
# 3 N8N    0
# 4 P1H    0
# 5 N8N    1
# 6 N8N    1

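Prevalence can be calculated as the number of positive diagnoses divided by the total number of observations, i.e. sum(Lyme)/n(). The appropriate function is summarise, which collapses each group to a single row (mutate, by contrast, keeps one row per observation):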

library(dplyr)

Data.All.df.2008 %>% 
    group_by(FSA) %>% 
    summarise(Prevalence = sum(Lyme)/n())

# # A tibble: 4 x 2
#   FSA   Prevalence
#   <chr>      <dbl>
# 1 B3L        0.778
# 2 N8N        0.571
# 3 N8R        0.583
# 4 P1H        0.467
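
The same idea can be extended to report the case and test counts next to the prevalence, which makes it easier to see which postal codes have too few tests to trust. The sketch below is illustrative only: the `dogs` data frame is a hypothetical stand-in for the asker's data, and storing `Lyme` as text is an assumption suggested by the `Lyme=="1"` comparison in the question. If `Lyme` is already numeric, the conversion step is harmless.

```r
library(dplyr)

# Hypothetical stand-in for the real data (assumption: Lyme is stored as "0"/"1" text).
dogs <- data.frame(
  Year = c(2008, 2008, 2008, 2008, 2008, 2009),
  FSA  = c("N8N", "N8N", "N8R", "N8R", "N8R", "N8N"),
  Lyme = c("1", "0", "0", "0", "1", "1"),
  stringsAsFactors = FALSE
)

prevalence_by_fsa <- dogs %>%
  filter(Year == 2008) %>%                            # keep only the 2008 records
  mutate(Lyme = as.integer(as.character(Lyme))) %>%   # "0"/"1" (text or factor) -> 0/1
  group_by(FSA) %>%
  summarise(
    Cases      = sum(Lyme, na.rm = TRUE),  # number of positive tests
    Tested     = sum(!is.na(Lyme)),        # number of tests with a known result
    Prevalence = Cases / Tested            # proportion positive per FSA
  )

prevalence_by_fsa
```

With this toy data, N8N has 1 case out of 2 tests (0.5) and N8R has 1 case out of 3 tests. When `Lyme` contains no missing values, `mean(Lyme)` gives the same result as `sum(Lyme)/n()`.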

Regarding "r - How to calculate disease prevalence by a variable in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59941654/
