r - Match and extract substrings in R

Tags: r regex string stringr

I have text data in which every line is a character string.

[1]"1128=9,9=282,35=X,34=4846318,52=20140107224500037,34=20140107,268=3,279=0,22=8,48=637548,83=585590,107=ZCH4,269=4,270=425,273=224500000,286=5,279=0,22=8,48=637548,83=585591,107=ZCH4,269=E,273=425.5,273=224500000,279=0,273=8,48=637548,34=585592,107=ZCH4,269=F,270=425,271=100,273=224500000,10=144"
[2]"1128=9,9=467,35=X,34=4846344,52=20140107224500107,75=20140108,268=5,279=0,22=8,48=772825,279=0,22=8,48=692825,83=434250,107=ZCZ4,269=E,270=453,271=41,273=224500000,279=0,22=8,48=692007,83=434251,107=ZCZ4,269=F,270=452.75,273=224500000,279=0,22=8,48=35213,83=434252274=2,336=0,451=0.25,279=1,22=8,48=692825,83=434253,107=ZCZ4,269=1,270=453,271=51,273=224500000,336=0,346=17,1023=1,10=239"

I want to trim the data so that only the substrings starting with "48=" and "34=" are kept.

My current code is:

ex_between(data, c('48=', '34='), c(',', ','), extract=TRUE)

It works, but it also strips off the "48=" and "34=" parts, which I want to keep.

Expected result:

[1]"34=4846318,34=20140107,48=637548,48=637548,48=637548,34=585592"
[2]"34=4846344,48=772825,48=692825,48=692007,48=35213,48=692825"

The "34=..." and "48=..." elements in the trimmed data must appear in the same order as in the original data.

Best answer

How about:

# Sample strings
x <- c("1128=9,9=282,35=X,34=4846318,52=20140107224500037,34=20140107,268=3,279=0,22=8,48=637548,83=585590,107=ZCH4,269=4,270=425,273=224500000,286=5,279=0,22=8,48=637548,83=585591,107=ZCH4,269=E,273=425.5,273=224500000,279=0,273=8,48=637548,34=585592,107=ZCH4,269=F,270=425,271=100,273=224500000,10=144",
"1128=9,9=467,35=X,34=4846344,52=20140107224500107,75=20140108,268=5,279=0,22=8,48=772825,279=0,22=8,48=692825,83=434250,107=ZCZ4,269=E,270=453,271=41,273=224500000,279=0,22=8,48=692007,83=434251,107=ZCZ4,269=F,270=452.75,273=224500000,279=0,22=8,48=35213,83=434252274=2,336=0,451=0.25,279=1,22=8,48=692825,83=434253,107=ZCZ4,269=1,270=453,271=51,273=224500000,336=0,346=17,1023=1,10=239")

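# Split each string on commas, keep the fields that start with "48=" or "34="
# (the ^ anchor stops tags such as "148=" from matching), then rejoin them
# in their original order
unlist(lapply(strsplit(x, ","), function(fields)
    paste(fields[grep("^(48|34)=\\d+", fields)], collapse = ",")))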
#[1] "34=4846318,34=20140107,48=637548,48=637548,48=637548,34=585592"
#[2] "34=4846344,48=772825,48=692825,48=692007,48=35213,48=692825"
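
As an alternative, the same result can probably be obtained without splitting, by pulling the matches straight out of each string with base R's regmatches() and gregexpr(). This is only a sketch: the two strings are shortened versions of the sample records, the names pattern, matches and result are made up for illustration, and it assumes that fields are always comma-separated and that a value never contains a comma.

```r
# Shortened versions of the two sample records, for illustration only
x <- c("1128=9,9=282,35=X,34=4846318,52=20140107224500037,34=20140107,22=8,48=637548,83=585590,48=637548,34=585592,10=144",
       "1128=9,9=467,35=X,34=4846344,75=20140108,22=8,48=772825,48=692825,83=434250,346=17,10=239")

# Match "48=" or "34=" only at the start of a field (start of the string or
# right after a comma), then take everything up to the next comma
pattern <- "(?<![^,])(48|34)=[^,]*"
matches <- regmatches(x, gregexpr(pattern, x, perl = TRUE))

# Rejoin each record's matches; they come back in their original order
result <- vapply(matches, paste, character(1), collapse = ",")
print(result)
```

The lookbehind needs perl = TRUE. A record with no matching field comes back as an empty string rather than being dropped, so the result stays aligned with the input.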

Regarding r - Match and extract substrings in R, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47493702/
