python - Simple Python regex question

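I'm trying to write a small piece of code with the regular expression module (re) that strips part of a URL read from a .csv file and returns the selected block as output. If the stripped part ends in .com/go/, I want it to return whatever follows "go". The code is as follows: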

import csv
import re

with open('rtdata.csv', 'rb') as fhand:
    reader = csv.reader(fhand)
    for row in reader:
        url=row[6].strip()
        section=re.findall("^http://www.xxxxxxxxx.com/(.*/)", url)
        if section==re.findall("^go.*", url):
            section=re.findall("^http://www.xxxxxxxxx.com/go/(.*/)", url)

        print url
        print section

Here is some sample input and output:

Example 1
1. Input: http://www.xxxxxxxxx.com/go/news/videos/
2. Output: news/videos
Example 2
1. Input: http://www.xxxxxxxxx.com/new-cars/
2. Output: new-cars

What am I missing here?

Best answer

Try the following

s = re.search('http://www.xxxxxxxxx.com/(go/)?(.*)/', url)
section = s.group(2)

instead of

    section=re.findall("^http://www.xxxxxxxxx.com/(.*/)", url)
    if section==re.findall("^go.*", url):
        section=re.findall("^http://www.xxxxxxxxx.com/go/(.*/)", url)
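As a rough illustration of why the original comparison never succeeds, the sketch below runs the two findall calls on the first sample URL from the question. The variable name go_check is introduced here only for illustration. Because "^go.*" is anchored at the start of the whole URL, which begins with "http", it finds nothing, so the two lists are never equal and the "go/" branch is never taken.

```python
import re

url = "http://www.xxxxxxxxx.com/go/news/videos/"

# First pattern from the question: captures everything after the host.
section = re.findall("^http://www.xxxxxxxxx.com/(.*/)", url)

# Second pattern from the question: "^" anchors at the start of the URL,
# which starts with "http", not "go", so nothing matches.
go_check = re.findall("^go.*", url)  # go_check is a hypothetical name

print(section)              # ['go/news/videos/']
print(go_check)             # []
print(section == go_check)  # False, so the "go/" branch never runs
```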

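The regular expression used:

http://www.xxxxxxxxx.com/(go/)?(.*)/

The optional group (go/)? consumes a leading "go/" when it is present, and the second group (.*) captures the rest of the path up to the final slash, which is why group(2) holds the section in both cases.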
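A minimal sketch of the same idea wrapped in a function, assuming Python 3. The names SECTION_RE and extract_section are hypothetical, and this variant differs from the answer's pattern in three ways that are choices made here, not part of the original answer: the dots in the host name are escaped so they match only a literal ".", the "go/" group is non-capturing so the section is group(1), and the pattern is anchored with "^" and "$". It also returns None when the URL does not match, since re.search returns None in that case and calling .group() on it would raise an error.

```python
import re

# Host name is the placeholder from the question; dots are escaped so they
# match a literal "." only. (?:go/)? optionally skips a leading "go/".
SECTION_RE = re.compile(r'^http://www\.xxxxxxxxx\.com/(?:go/)?(.*)/$')

def extract_section(url):
    """Return the path after the host (minus an optional 'go/'), or None."""
    match = SECTION_RE.search(url.strip())
    return match.group(1) if match else None

for url in ("http://www.xxxxxxxxx.com/go/news/videos/",
            "http://www.xxxxxxxxx.com/new-cars/"):
    print(url, "->", extract_section(url))
```

With the two sample inputs from the question this yields news/videos and new-cars; a URL on a different host or without a trailing slash yields None under these assumptions.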

Regarding "python - Simple Python regex question", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19552278/
