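I have a data frame in which I need to populate some columns based on values matched in another column.
Run this code to get the sample dataset: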
sample_data <- structure(list(Temp = c(NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_), Wind = c(NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_), NodeID = c(3, 5,
6, 8, 9), node_path = c("Temp <= 82 , Wind <= 6.9", "Temp <= 82 , Wind > 6.9 , Temp <= 77",
"Temp <= 82 , Wind > 6.9 , Temp > 77", "Temp > 82 , Wind <= 10.3",
"Temp > 82 , Wind > 10.3")), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
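This is what I want to achieve: match the column names Temp and Wind along node_path and return the specific matching values into those columns. I tried str_extract_all(node_path, pattern = "^Temp.*"), but that returns the entire cell value under node_path. Any idea how to achieve this?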
Best answer
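One option (without reshaping) is to loop across the NA columns Temp and Wind with across(), extract the matching elements with str_extract_all() using a pattern built from the current column name (cur_column()), and then loop over the resulting list with map_chr(), pasting each element's matches together with toString: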
library(dplyr)
library(purrr)
library(stringr)
sample_data %>%
  mutate(across(c(Temp, Wind), ~ map_chr(str_extract_all(node_path,
    str_c(cur_column(), "\\D+[0-9.]+")), toString)))
Output:
# A tibble: 5 x 4
  Temp                   Wind         NodeID node_path
  <chr>                  <chr>         <dbl> <chr>
1 Temp <= 82             Wind <= 6.9       3 Temp <= 82 , Wind <= 6.9
2 Temp <= 82, Temp <= 77 Wind > 6.9        5 Temp <= 82 , Wind > 6.9 , Temp <= 77
3 Temp <= 82, Temp > 77  Wind > 6.9        6 Temp <= 82 , Wind > 6.9 , Temp > 77
4 Temp > 82              Wind <= 10.3      8 Temp > 82 , Wind <= 10.3
5 Temp > 82              Wind > 10.3       9 Temp > 82 , Wind > 10.3
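To see why the pattern from the question returns the whole cell while the pattern above does not, it may help to try both on a single string. This is only a small sketch: the variable x is an illustrative name, and its value is copied from the node_path column of the sample data.

```r
library(stringr)

# One value taken from the node_path column of the sample data
x <- "Temp <= 82 , Wind > 6.9 , Temp <= 77"

# "^Temp.*" anchors at the start, and ".*" then consumes the rest of the string,
# so the single match is the entire value
whole <- str_extract_all(x, "^Temp.*")[[1]]
whole

# "Temp\\D+[0-9.]+" matches "Temp", then non-digits, then one run of digits/dots,
# so each Temp condition comes back as its own match
parts <- str_extract_all(x, "Temp\\D+[0-9.]+")[[1]]
parts

# toString() collapses the matches into one comma-separated string
toString(parts)
```

Because the second pattern is built from the column name, the same idea works for Wind by swapping the prefix.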
Alternatively, in base R, use regmatches()/gregexpr() to extract the values based on the pattern, then paste them together with toString:
sample_data[1:2] <- lapply(names(sample_data)[1:2], function(x)
  sapply(regmatches(sample_data$node_path, gregexpr(paste0(x,
    "\\D+[0-9.]+"), sample_data$node_path)), toString))
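The base R one-liner can be easier to follow when unpacked into a small function. The sketch below assumes a hypothetical helper name, extract_conditions, and a two-element vector paths copied from the sample data; it uses vapply() instead of sapply() purely so that the return type is always a character vector.

```r
# Hypothetical helper: collect every "<col> <op> <number>" condition per string
extract_conditions <- function(path, col) {
  pattern <- paste0(col, "\\D+[0-9.]+")
  # gregexpr() locates all matches; regmatches() returns them as a list,
  # one character vector per element of `path`
  matches <- regmatches(path, gregexpr(pattern, path))
  # Collapse each element's matches into a single comma-separated string
  vapply(matches, toString, character(1))
}

# Two values copied from the node_path column of the sample data
paths <- c("Temp <= 82 , Wind <= 6.9",
           "Temp <= 82 , Wind > 6.9 , Temp <= 77")

extract_conditions(paths, "Temp")
extract_conditions(paths, "Wind")
```

Note that a string with no match yields an empty string rather than NA, since toString() on an empty character vector returns "".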
Another option is to split the rows with separate_rows() and then reshape back to wide format with pivot_wider():
library(tidyr)
sample_data %>%
  select(node_path, NodeID) %>%
  # one row per condition
  separate_rows(node_path, sep = "\\s*,\\s*") %>%
  # first word of each condition is the target column name
  mutate(colnm = word(node_path, 1)) %>%
  group_by(colnm, NodeID) %>%
  summarise(new = str_c(node_path, collapse = ", "), .groups = 'drop') %>%
  pivot_wider(names_from = colnm, values_from = new) %>%
  # bring back the original node_path; `by` is stated explicitly
  left_join(sample_data %>%
    select(NodeID, node_path), by = "NodeID")
Output:
# A tibble: 5 x 4
  NodeID Temp                   Wind         node_path
   <dbl> <chr>                  <chr>        <chr>
1      3 Temp <= 82             Wind <= 6.9  Temp <= 82 , Wind <= 6.9
2      5 Temp <= 82, Temp <= 77 Wind > 6.9   Temp <= 82 , Wind > 6.9 , Temp <= 77
3      6 Temp <= 82, Temp > 77  Wind > 6.9   Temp <= 82 , Wind > 6.9 , Temp > 77
4      8 Temp > 82              Wind <= 10.3 Temp > 82 , Wind <= 10.3
5      9 Temp > 82              Wind > 10.3  Temp > 82 , Wind > 10.3
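The reshaping route is perhaps clearest when the intermediate long format is inspected before the summarise/pivot steps. The following sketch assumes a two-row toy tibble (the names toy and long are illustrative only) with values copied from the sample data.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy input: two rows copied from the sample data
toy <- tibble(NodeID = c(3, 5),
              node_path = c("Temp <= 82 , Wind <= 6.9",
                            "Temp <= 82 , Wind > 6.9 , Temp <= 77"))

long <- toy %>%
  # split on commas (with optional surrounding whitespace): one row per condition
  separate_rows(node_path, sep = "\\s*,\\s*") %>%
  # the first word of each condition names the column it belongs to
  mutate(colnm = word(node_path, 1))

long
```

Each condition now sits on its own row tagged with its target column, which is what lets group_by() and pivot_wider() collect the conditions per NodeID and column.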
Source: "r - Is there a way to partially match text/strings and return the full value in R" on Stack Overflow: https://stackoverflow.com/questions/68248949/