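I'm not great with loops, but I'm working on getting better at them. I'm using tidycensus to select and pull some variables across years (the dummy data in the example below is representative). For a given set of selected variables (dv_acs), I want to pull the matching rows from the full codebook that you can download with load_variables for each year, and then full_join them. In most cases the information is the same across years, but I want all of it so I can double-check it and note any differences.
Here's the setup that works: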
library(tidycensus)
library(dplyr)
#getting codebook for all ACS years for every single variable possible
for(x in c(2009:2020)) {
  filename <- paste0("v", x)
  assign(filename, (load_variables(x, "acs5", cache = TRUE)))
}
#selecting and recoding variables to pull in
dv_acs = c(
  hus = "B25002_001",
  husocc = "B25002_002",
  husvac = "B25002_003"
)
This does what I want one year at a time, and from there I could build the full join piece by piece:
#creating a codebook a year at a time for variables I'm interested in
codebook <- v2009 %>%
  filter(name %in% dv_acs) %>%
  mutate(id = names(dv_acs), .before = 1)
colnames(codebook) = c("id", "name", "label_2009", "concept_2009")
codebook2 <- v2010 %>%
  filter(name %in% dv_acs) %>%
  mutate(id = names(dv_acs), .before = 1)
colnames(codebook2) = c("id", "name", "label_2010", "concept_2010")
codebook <- full_join(codebook, codebook2, by=c("id", "name"))
Here is where I tried to build the codebook for my specific variables in a single loop, but it fails:
#creating a loop to pull in and join a codebook for all years
for(x in c(2009:2010)){
  codebook <- data.frame(matrix(ncol = 2, nrow = 0)) #create a master file I can join the files to as they load in through the loop
  colnames(codebook) <- c("id", "name") #giving right label names
  filename <- paste0("v", x) #this is where I'm starting to have trouble; this saves as a value, and I can't then use it to call the dataframe
  temp <- filename %>% (name %in% dv_acs) %>%
    mutate(id = names(dv_acs), .before = 1)
  colnames(temp) <- c("id", "name", paste0("label_", x), paste0("concept_", x))
  codebook <- full_join(codebook, temp, by=c("id", "name"))
}
The error reported is: "Error in name %in% dv_acs : object 'name' not found"
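Best answer
It is better not to create objects in the global environment. Instead, they can be stored in a list. Since the objects already exist here, one option is mget, which returns the named objects as a named list: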
library(stringr)
library(purrr)
library(dplyr)
out <- mget(str_c("v", 2009:2020)) %>%
  imap(~ {
    nm <- str_c(c("label", "concept"), str_remove(.y, "v"))
    .x %>%
      select(-any_of("geography")) %>%
      filter(name %in% dv_acs) %>%
      arrange(match(name, dv_acs)) %>%
      mutate(id = names(dv_acs), .before = 1) %>%
      rename_with(~ nm, c("label", "concept"))
  }) %>%
  reduce(full_join)
-output
> out
# A tibble: 3 × 26
id name label…¹ conce…² label…³ conce…⁴ label…⁵ conce…⁶ label…⁷ conce…⁸ label…⁹ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 hus B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
2 huso… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
3 husv… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
# … with 5 more variables: concept2018 <chr>, label2019 <chr>, concept2019 <chr>, label2020 <chr>, concept2020 <chr>, and abbreviated variable names ¹label2009,
# ²concept2009, ³label2010, ⁴concept2010, ⁵label2011, ⁶concept2011, ⁷label2012, ⁸concept2012, ⁹label2013, ˟concept2013, ˟label2014, ˟concept2014, ˟label2015,
# ˟concept2015, ˟label2016, ˟concept2016, ˟label2017, ˟concept2017, ˟label2018
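For reference, the original loop fails because paste0("v", x) yields a character string such as "v2009", not the data frame of that name, so the pipe never reaches a table with a name column. get() (or mget() for several names at once) looks the object up by name. The loop also re-creates an empty codebook on every iteration, which discards the earlier years. Below is a minimal sketch of a repaired loop. It uses small made-up stand-ins for v2009 and v2010 (the label and concept text is illustrative, not real ACS codebook content) so that it runs without downloading anything:

```r
library(dplyr)

# Made-up stand-ins for the objects created by load_variables();
# the label/concept text is illustrative only.
v2009 <- tibble(
  name    = c("B25002_001", "B25002_002", "B25002_003", "B99999_001"),
  label   = c("Total", "Occupied", "Vacant", "Other"),
  concept = "OCCUPANCY STATUS"
)
v2010 <- v2009

dv_acs <- c(hus = "B25002_001", husocc = "B25002_002", husvac = "B25002_003")

codebook <- NULL  # initialise once, outside the loop
for (x in 2009:2010) {
  temp <- get(paste0("v", x)) %>%       # get() turns the string into the object
    filter(name %in% dv_acs) %>%
    arrange(match(name, dv_acs)) %>%    # put rows in dv_acs order before adding id
    mutate(id = names(dv_acs), .before = 1)
  colnames(temp) <- c("id", "name", paste0("label_", x), paste0("concept_", x))
  # first year: start the codebook; later years: join onto it
  codebook <- if (is.null(codebook)) temp else full_join(codebook, temp, by = c("id", "name"))
}
codebook
```

Starting from NULL rather than an empty data.frame avoids joining character key columns against the untyped columns of an empty frame.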
If we want to do everything in a list, without having to create the yearly objects in the global environment:
out <- map(2009:2020, ~ {
  nm <- str_c(c("label", "concept"), "_", .x)
  load_variables(.x, "acs5") %>%
    select(-any_of("geography")) %>%
    filter(name %in% dv_acs) %>%
    arrange(match(name, dv_acs)) %>%
    mutate(id = names(dv_acs), .before = 1) %>%
    rename_with(~ nm, c("label", "concept"))
}) %>%
  reduce(full_join)
-output
> out
# A tibble: 3 × 26
id name label…¹ conce…² label…³ conce…⁴ label…⁵ conce…⁶ label…⁷ conce…⁸ label…⁹ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 hus B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
2 huso… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
3 husv… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
# … with 5 more variables: concept_2018 <chr>, label_2019 <chr>, concept_2019 <chr>, label_2020 <chr>, concept_2020 <chr>, and abbreviated variable names
# ¹label_2009, ²concept_2009, ³label_2010, ⁴concept_2010, ⁵label_2011, ⁶concept_2011, ⁷label_2012, ⁸concept_2012, ⁹label_2013, ˟concept_2013, ˟label_2014,
# ˟concept_2014, ˟label_2015, ˟concept_2015, ˟label_2016, ˟concept_2016, ˟label_2017, ˟concept_2017, ˟label_2018
# ℹ Use `colnames()` to see all variable names
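Since the stated goal is to note any differences between years, one possible follow-up is to flag the variables whose labels are not identical across all years. This is only a sketch: out_demo is a small made-up table shaped like out (its values are invented), and differs is a hypothetical column name.

```r
library(dplyr)

# Made-up table shaped like `out`, with one label changed in 2010.
out_demo <- tibble(
  id         = c("hus", "husocc"),
  name       = c("B25002_001", "B25002_002"),
  label_2009 = c("Total", "Occupied"),
  label_2010 = c("Total", "Total: Occupied")
)

flagged <- out_demo %>%
  rowwise() %>%
  # TRUE when the label columns of a row hold more than one distinct value
  mutate(differs = n_distinct(c_across(starts_with("label"))) > 1) %>%
  ungroup()
flagged
```

The same pattern with starts_with("concept") would check the concept columns.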
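Regarding "r - Making a codebook for selected tidycensus variables across years with a loop, trouble calling the dataset in the loop", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/74209434/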