R: creating a new column without a for loop

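Suppose I have a data frame whose first column contains a few numbers. I want to take those numbers, use them as positions in a string, and extract the substring made of the character at each position plus the 2 characters before and after it. To clarify,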

aggSN <- data.frame(V1=c(5,6,7,8),V2="blah")
gen <- "AJSDAFKSDAFJKLASDFKJKA"  # <- take this string
aggSN                            # <- take the numbers in the first column
# V1    V2
#  5  blah
#  6  blah
#  7  blah
#  8  blah

and create a new column V3 that looks like this:

aggSN                           
# V1    V2    V3
#  5  blah SDAFK   # <- took the two characters before and after the 5th character
#  6  blah DAFKS   # <- took the two characters before and after the 6th character 
#  7  blah AFKSD   # <- took the two characters before and after the 7th character 
# 10  blah SDAFJ   # <- took the two characters before and after the 10th character 
#  2  blah AJSD    # <- here you can see that the substring is cut off at the start of the string

Right now I am using a for loop, which works, but it takes a very long time on very large data frames and long strings. Is there an alternative? Thanks.

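fillvector <- character(nrow(aggSN))   # preallocate the result vector
for (j in seq_len(nrow(aggSN))) {
  fillvector[j] <- substr(gen, aggSN[j, "V1"] - 2, aggSN[j, "V1"] + 2)   # column name must be quoted
}
aggSN$V3 <- fillvector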

Best answer

You can use substring() without writing a loop, because it is vectorized over its start and end positions:

aggSN <- data.frame(V1=c(5,6,7,8,2),V2="blah")
gen <- "AJSDAFKSDAFJKLASDFKJKA" 

with(aggSN, substring(gen, V1-2, V1+2))
# [1] "SDAFK" "DAFKS" "AFKSD" "FKSDA" "AJSD"

So, to add the new column:

aggSN$V3 <- with(aggSN, substring(gen, V1-2, V1+2))
aggSN
#   V1   V2    V3
# 1  5 blah SDAFK
# 2  6 blah DAFKS
# 3  7 blah AFKSD
# 4  8 blah FKSDA
# 5  2 blah  AJSD
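
As a quick sanity check, the sketch below compares the loop from the question with the vectorized call on the same sample data. The names loop_result and vec_result are introduced here purely for illustration. Both substr() and substring() clip a start position below 1 to the start of the string, which is why the row with V1 = 2 yields only four characters.

```r
gen <- "AJSDAFKSDAFJKLASDFKJKA"
aggSN <- data.frame(V1 = c(5, 6, 7, 8, 2), V2 = "blah")

# Loop version: one substr() call per row
loop_result <- character(nrow(aggSN))
for (j in seq_len(nrow(aggSN))) {
  loop_result[j] <- substr(gen, aggSN$V1[j] - 2, aggSN$V1[j] + 2)
}

# Vectorized version: a single call over the whole column
vec_result <- substring(gen, aggSN$V1 - 2, aggSN$V1 + 2)

# TRUE when both approaches produce the same character vector
identical(loop_result, vec_result)
```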

If you want something a bit faster, I would use stringi::stri_sub instead of substring().
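
A minimal sketch of the stringi variant, assuming the stringi package is installed (the code falls back to substring() otherwise). One difference to keep in mind: stri_sub() treats negative positions as counting from the end of the string, so this sketch clamps the start position at 1 with pmax() to keep the same clipping behavior as substring(). The clamp is an addition made here, not part of the original answer.

```r
gen <- "AJSDAFKSDAFJKLASDFKJKA"
aggSN <- data.frame(V1 = c(5, 6, 7, 8, 2), V2 = "blah")

if (requireNamespace("stringi", quietly = TRUE)) {
  # Clamp the start at 1: stri_sub() counts negative positions from the end
  from <- pmax(aggSN$V1 - 2, 1)
  aggSN$V3 <- stringi::stri_sub(gen, from = from, to = aggSN$V1 + 2)
} else {
  # Base R fallback when stringi is not available
  aggSN$V3 <- substring(gen, aggSN$V1 - 2, aggSN$V1 + 2)
}
aggSN$V3
```

Whether the speedup matters depends on the data size, so it is worth timing both versions on your own data before switching.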

This question about creating a new column in R without a for loop comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/31778557/
