python - 如何显示完整结果,而不是 python 中正则表达式搜索的匹配文本

标签 python python-3.x regex python-re findall

我正在创建一个根据关键字搜索文件的脚本,我的输出应该是整个观察结果,而不仅仅是匹配的文本,但我发现 .group 对此不起作用。

import re 
import os 
 
pers_info = pd.read_csv(r".....StateWorkforceMailingList_2-7-19a.csv",encoding='utf-8')

Pers_info['State'] = Texas, Florida etc... 

 files=os.listdir(r"....\State Files")
 
Files = list of WORKFORCE_2017_ALABAMA_FILE.xlsx,...,n

matches=re.findall(pers_info.State[4], files.replace("_", " "),re.I)
print(match) 

我的预期输出是 WORKFORCE_2017_ALABAMA_FILE.xlsx 相反,我得到“阿拉巴马州”

我应该尝试 bool 掩码吗?

最佳答案

使用

>>> import pandas as pd
>>> Pers_info = pd.DataFrame({'State':['Texas', 'Alabama', 'Florida']})
>>> Files = ['WORKFORCE_2017_ALABAMA_FILE.xlsx', 'WORKFORCE_2017_FILE.xlsx']
>>> pattern = re.compile(rf'(?<![^\W_])(?:{"|".join(Pers_info["State"].to_list())})(?![^\W_])', re.I)
>>> list(filter(pattern.search, Files))
['WORKFORCE_2017_ALABAMA_FILE.xlsx']

参见regex proof .

说明

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    [^\W_]                   any character except: non-word
                             characters (all but a-z, A-Z, 0-9, _),
                             '_'
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    Texas                    'Texas'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    Alabama                  'Alabama'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    Florida                  'Florida'
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    [^\W_]                   any character except: non-word
                             characters (all but a-z, A-Z, 0-9, _),
                             '_'
--------------------------------------------------------------------------------
  )                        end of look-ahead

关于python - 如何显示完整结果,而不是 python 中正则表达式搜索的匹配文本,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/66606769/

相关文章:

python - 在线程上调用时,Python函数无法开始执行

python - 如何通过对第 3 列中的值求和来将前 2 列中具有相同值的 Pandas Dataframe 行组合在一起?

regex - 如何在Python中使用正则表达式从字符串中获取日期?

javascript - 大型数组的 angular.copy 面临着非常糟糕的性能

python - Cloud Dataflow 写入 BigQuery Python 错误

python - Django:正确 request.POST 中的值但无法保存表单

python - PyTorch: 'ToTensor()' 将彩色图片变成 9 张灰度图片

python - 如何使用 nltk 使用 line_tokenize 或 word_tokenize 分隔新行?

python - 具有多个组的正则表达式无法捕获括号

regex - 将姓氏拆分到新行