python - Replacing one string with another using a regex in Python: re.error: bad escape \w at position 0

Tags: python regex

I am trying to replace occurrences such as "word one" with "word_one", that is, replace the whitespace between two words with "_".

Here is my code:

import re

labels_ls = ['word <= 0.01', 'word_two <= 0.23', 'word three <= 0.01']

regex_whitespace = r'\w+\s+\w+\b'
new_regex = r'\w+\_+\w+\b'
pattern = re.compile(regex_whitespace) # this I just added after reviewing other related questions

# Loop through labels_ls to find any whitespace-separated n-gram labels (e.g. gilt maximal)

for i in labels_ls:
    if re.match(regex_whitespace, i):
        # replace the whitespace with a '_' to form gilt_maximal
        new_string = re.sub(pattern, new_regex, i)
        print('new string: ', new_string)

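I tested my regex at https://pythex.org and it works as required, but when I run this code I get the following error:

re.error: bad escape \w at position 0

I have looked at all the related answered questions: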

how to fix - error: bad escape \u at position 0

Regex: Replace one pattern with another

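I tried removing the r prefix in front of the regex, as suggested in the questions above, but it still does not work.

I also tried using compile(), but that did not solve the problem either.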

labels_ls = ['internal_punctuation <= 0.042', 'darf <= 0.717', 'formal_global_yes <= 0.5', 'wert <= 0.272', 'signal <= 0.5', 'Flesch_Index <= 0.813', 'zulass <= 0.379', 'polarity <= 0.713', 'Nb_of_auxiliary <= 0.071', 'gini = 0.0', 'polarity <= 0.375', 'gini = 0.0', 'Nb_of_verbs <= 0.094', 'weakwords_nb <= 0.143', 'passive_global_yes <= 0.5', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Nb_of_verbs <= 0.094', 'passive_global_yes <= 0.5', 'WPS <= 0.062', 'measurement_values_no <= 0.5', 'gini = 0.0', 'SPW <= 0.575', 'weird_words <= 0.042', 'weakwords_nb <= 0.036', 'SPW <= 0.272', 'gini = 0.0', 'words_nb <= 0.033', 'gini = 0.5', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Flesch_Index <= 0.774', 'SPW <= 0.331', 'gini = 0.0', 'gini = 0.0', 'Comp_conj <= 0.375', 'SPW <= 0.111', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Sub_Conj <= 0.25', 'weird_words <= 0.208', 'zsdf <= 0.5', 'signal <= 0.297', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'words_nb <= 0.164', 'Aux_Start_no <= 0.5', 'gini = 0.0', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'werden <= 0.125', 'darf <= 0.297', 'polarity <= 0.925', 'SPW <= 0.376', 'WPS <= 0.11', 'numerical_values <= 0.091', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'WPS <= 0.11', 'gini = 0.0', 'gini = 0.0', 'polarity <= 0.25', 'gini = 0.0', 'Flesch_Index <= 0.663', 'words_nb <= 0.033', 'SPW <= 0.475', 'gini = 0.0', 'gini = 0.0', 'Comp_conj <= 0.125', 'gini = 0.56', 'gini = 0.0', 'Flesch_Index <= 0.75', 'gini = 0.444', 'gini = 0.0', 'Aux_Start_yes <= 0.5', 'darf <= 0.241', 'Nb_of_verbs <= 0.156', 'gini = 0.0', 'SPW <= 0.246', 'polarity <= 0.675', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Sub_Conj <= 0.25', 'numerical_values <= 0.227', 'funktion <= 0.348', 'internal_punctuation <= 0.458', 'polarity <= 0.375', 'gini = 0.0', 'Nb_of_verbs <= 0.031', 
'gini = 0.0', 'Flesch_Index <= 0.409', 'gini = 0.0', 'numerical_values <= 0.136', 'WPS <= 0.065', 'darf <= 0.359', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'formal_global_no <= 0.5', 'WPS <= 0.164', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gilt randbeding <= 0.181', 'fahrzeug <= 0.352', 'gini = 0.0', 'zulass <= 0.082', 'gini = 0.0', 'gini = 0.0', 'fur <= 0.194', 'weakwords_nb <= 0.321', 'gini = 0.444', 'gini = 0.0', 'gini = 0.0', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'Nb_of_verbs <= 0.344', 'gini = 0.0', 'gini = 0.0', 'words_nb <= 0.178', 'gini = 0.0', 'words_nb <= 0.224', 'gini = 0.0', 'gini = 0.0']

Best answer

You need to use

regex_whitespace = r'(\w+)\s+(\w+)\b'

and then:

new_string = re.sub(pattern, r'\1_\2', i)

See the Python demo online.
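For readers who cannot open the demo, here is a minimal self-contained sketch of the two changes applied to the sample list from the question. The helper name join_words is only illustrative and does not appear in the original code.

```python
import re

labels_ls = ['word <= 0.01', 'word_two <= 0.23', 'word three <= 0.01']

# Capture the word on each side of the whitespace so the replacement can reuse them.
pattern = re.compile(r'(\w+)\s+(\w+)\b')

def join_words(label):
    # \1 and \2 refer to the first and second captured words.
    # Labels without a "word<space>word" pair are returned unchanged.
    return pattern.sub(r'\1_\2', label)

new_labels = [join_words(label) for label in labels_ls]
print(new_labels)
```

Because sub() leaves non-matching strings untouched, the separate re.match() check from the question is not needed in this sketch.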

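The point is that you need to capture the word characters matched by the first regex into capturing groups and then refer to the captured values with backreferences in the replacement. new_regex = r'\w+\_+\w+\b' is redundant, because you cannot use a regex pattern as the replacement: a replacement pattern may only contain backreferences and escape sequences (a literal backslash must be escaped there).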
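The difference between a pattern and a replacement template can be sketched as follows. This assumes Python 3.7 or later, where an unknown letter escape such as \w in the replacement raises re.error; the variable names are illustrative only.

```python
import re

text = 'word three'
pattern = re.compile(r'(\w+)\s+(\w+)')

# Numbered backreferences and the explicit \g<n> form are both valid in a replacement.
numbered = pattern.sub(r'\1_\2', text)
explicit = pattern.sub(r'\g<1>_\g<2>', text)

# A regex class such as \w is not a valid replacement escape and is rejected.
try:
    pattern.sub(r'\w+_\w+', text)
    error_message = None
except re.error as exc:
    error_message = str(exc)

# A literal backslash has to be doubled inside the replacement template.
literal_backslash = pattern.sub(r'\1\\\2', text)

print(numbered)
print(explicit)
print(error_message)
print(literal_backslash)
```

The \g<1> form is mainly useful when a group reference is followed directly by a digit, where \1 alone would be ambiguous.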

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/56008877/