python - Search pattern is not unique? - regex

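I want to write a function that cleans up the index column of a dataframe.

Remove the entire row when the index holds a higher-level ID. For example, remove

East Kootenay (5901) 01010
Trim the index down to the 7-digit lower-level ID. For example, turn

East Kootenay A (5901017) RDA 02020 into 5901017
If there are two pairs of parentheses, keep only the 7 digits in the second pair. For example,

Sechelt (Part) (5929803) IGD 02020 to 5929803

Capital H (Part 1) (5917054) RDA 01020 to 5917054

Capital H (Part 2) (5917056) RDA 02030 to 5917056

T'Sou-ke 1 (Sooke 1) (5917817) IRI 01010 to 5917817

T'Sou-ke 2 (Sooke 2) (5917818) IRI 00000 to 5917818

Sample code that only works when there is a single pair of parentheses: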

import re

import pandas as pd


def extract_id(s):
    # '.*' is greedy: with two pairs of parentheses it captures everything
    # from the first '(' to the last ')', and int() then raises ValueError.
    m = re.search(r'\((.*)\)', s)
    if m:
        return int(m.group(1))

if __name__ == '__main__':
    # Read data
    census_subdivision_profile = pd.read_excel('../data/census_subdivision_profile.xlsx', sheet_name='Data',
                                               index_col='Geography').T
    print(census_subdivision_profile.head())
    print(census_subdivision_profile.shape)

    # Replace each index label with the ID extracted from it
    census_subdivision_profile.index = census_subdivision_profile.index.map(extract_id)
    print(census_subdivision_profile.index)
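
To make the failure concrete, here is a minimal reproduction that uses only the `re` module and two of the labels listed above. The helper name `greedy_capture` is made up for this illustration; the pattern is the one from the sample code.

```python
import re


def greedy_capture(s):
    # Same pattern as the sample code: '.*' matches as much as it can,
    # so the capture runs from the first '(' to the last ')'.
    m = re.search(r'\((.*)\)', s)
    return m.group(1) if m else None


one = greedy_capture("East Kootenay A (5901017) RDA 02020")
two = greedy_capture("T'Sou-ke 1 (Sooke 1) (5917817) IRI 01010")

print(one)  # 5901017
print(two)  # Sooke 1) (5917817
```

The second capture is not a number, which is why `int()` fails on labels with two pairs of parentheses.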

For the full code, see another question I posted earlier:

Merge dataframes that have indices that one contains another (but not the same)

Best answer

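I think what you intended is '\(([^)]*)\)', so that each match stops at the first closing parenthesis instead of running on to the last one ... hth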
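
Note that `re.search` with this pattern returns the first pair of parentheses, so labels with two pairs still need one more step. Below is a rough sketch of one way to cover all three rules from the question. It assumes a lower-level ID is always exactly 7 ASCII digits inside the last pair of parentheses; the name `BRACKETS` and the choice to return `None` for higher-level IDs are illustrative, not part of the original answer.

```python
import re

# The pattern suggested in the answer: '[^)]*' cannot cross a ')',
# so every pair of parentheses is matched separately.
BRACKETS = re.compile(r'\(([^)]*)\)')


def extract_id(s):
    """Return the 7-digit ID from the last pair of parentheses, else None."""
    groups = BRACKETS.findall(s)
    if not groups:
        return None
    last = groups[-1]
    # Assumption: only 7-digit values are lower-level IDs; shorter ones
    # such as '5901' are higher-level IDs and are signalled with None.
    if re.fullmatch(r'[0-9]{7}', last):
        return int(last)
    return None


print(extract_id("East Kootenay (5901) 01010"))                # None
print(extract_id("East Kootenay A (5901017) RDA 02020"))       # 5901017
print(extract_id("Sechelt (Part) (5929803) IGD 02020"))        # 5929803
print(extract_id("T'Sou-ke 2 (Sooke 2) (5917818) IRI 00000"))  # 5917818
```

With pandas, one possible way to use this (not tested against the original spreadsheet) is to map the function over the index as in the question and then keep only the rows whose mapped value is not missing, which drops the higher-level rows.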

Regarding "python - Search pattern is not unique? - regex", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44789612/
