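I have a string variable named talk. Suppose I want to find every instance of the word "please" in talk and, within each row, append to each "please" a suffix containing that word's running count.
For example, if talk looks like this: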
"will you please come here please do it as soon as you can if you please"
I want it to look like this:
"will you please1 come here please2 do it as soon as you can if you please3"
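In other words, "please1" marks the first occurrence of "please", "please2" the second, and so on.
I wrote some code using regular expressions and a couple of loops (below), but it does not work perfectly, and even if I could fix it, it seems overly complicated. Is there a simpler way?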
* I first extract the portion of 'talk' beginning from the 1st please to the last
gen talk_pl = strtrim(stritrim(regexs(0))) if regexm(talk, "please.+please")
* I count the number of times "please" occurs in 'talk_pl'
egen count = noccur(talk_pl), string("please")
* in the loop below, i = 2nd-to-last count; x = 3rd-to-last count
qui levelsof count
foreach n in `r(levels)' {
    local i = `n' - 1
    local x = `i' - 1
    replace talk_pl = ustrregexrf(talk_pl, "please$", "please`n'") if count == `n'
    replace talk_pl = ustrregexrf(talk_pl, "please (?=.+?please`n')", "please`i' ") if count == `n'
    replace talk_pl = ustrregexrf(talk_pl, "please (?=.+?please`i')", "please`x' ") if count == `n'
}
Best answer
* Example generated by -dataex-. To install: ssc install dataex
clear
input str71 talk
"will you please come here please do it as soon as you can if you please"
end
// Install egenmore if not installed already
* ssc install egenmore
clonevar wanted = talk
// count occurrences of "please"
egen countplease = noccur(talk), string(please)
// Loop over 1 to max number of occurrences
sum countplease, meanonly
forval i = 1/`r(max)' {
    replace wanted = ustrregexrf(wanted, "\bplease\b", "please`i'")
}
list
     +---------------------------------------------------------------------------------------+
  1. | talk                                                                                  |
     | will you please come here please do it as soon as you can if you please               |
     |---------------------------------------------------------------------------------------|
     | wanted                                                                     | countp~e |
     | will you please1 come here please2 do it as soon as you can if you please3 |        3 |
     +---------------------------------------------------------------------------------------+
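The loop works because ustrregexrf() replaces only the first match on each pass, and once a "please" has a digit attached, \bplease\b no longer matches it, so the next pass moves on to the next unnumbered occurrence. If installing egenmore is not an option, the count can probably be obtained with built-in functions instead. The following is a minimal sketch under that assumption, not part of the original answer: it counts occurrences by measuring how much shorter talk becomes when every "please" is deleted. Like noccur(), this counts substrings, so a word such as "pleased" would inflate the count; the extra loop iterations should then simply leave the string unchanged.

```stata
clear
input str71 talk
"will you please come here please do it as soon as you can if you please"
end

clonevar wanted = talk

* Count occurrences without egenmore (assumed alternative to noccur()):
* characters removed by deleting every "please", divided by the word length
gen countplease = (strlen(talk) - strlen(subinstr(talk, "please", "", .))) / strlen("please")

* Each pass numbers the first "please" that is still unnumbered
summarize countplease, meanonly
forvalues i = 1/`r(max)' {
    replace wanted = ustrregexrf(wanted, "\bplease\b", "please`i'")
}

list
```

With several observations, the loop runs up to the largest count in the data; rows with fewer occurrences have nothing left to match on the later passes.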
Regarding "regex - How to add an incrementing value to a string at each occurrence of a string in Stata?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/65604864/