I have this list of lists:
my_list=[['word:', 'house', 'garden', '0,2%'],
['word:', 'house', 'garden', '0,2%'],
['house', 'garden', '0,2%'],
['house', 'garden', '0,2%'],
['garden', '0,2%', '0,125%'],
['house', '0,2%', '?????'],
['house', 'garden', '0,02%'],
['house', 'garden', '0,02%'],
['garden', '0,02%'],
['house', 'garden', '0,2%'],
['garden', '0,2%'],
['house', '0,2'],
['house', '0,2', '%'],
['house', 'garden', 'kids', '0,2%'],
['house', 'garden', 'kids', '0,2%'],
['house', '0,2%', 'boy'],
['house', '0,12%'],
['house', '4%.'],
['house', '4%.', '4.'],
['house', '0,2%”.']]
I need to extract the numbers according to the words house and garden, so that I get something like this:
{'garden': ['0,2', '0,2', '0,2', '0,2', '0,2', '0.125', '0.02', '0.02', '0.02', '0.2', '0.2', '0,2'], 'house': ['0.2', '0.2', '0.2', '0.2', '0,2', '0,02', '0,02', '0,2', '0,2', '0,2', '0,2', '0,2', '0,2', '0,12', '4.', '4.', '4.', '0,2']}
How can I get these values?
Unfortunately, this code:
result = defaultdict(list)
for l in my_list:
    k = None
    for v in l:
        if v in keywords:
            k = v
        if re.match(r'[0-9,.]+$', v):
            num = v
    if k is not None:
        result[k].append(num)
does not give me the expected output.
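Best answer
The problem is in your regular expression. You need to remove the $ anchor: with it, the pattern fails to match whenever anything (for example the % character) follows the expected characters, i.e. [0-9,.]. The rest of the code can also be simplified a bit: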
import re
from collections import defaultdict
my_list=[['word:', 'house', 'garden', '0,2%'],
['word:', 'house', 'garden', '0,2%'],
['house', 'garden', '0,2%'],
['house', 'garden', '0,2%'],
['garden', '0,2%', '0,125%'],
['house', '0,2%', '?????'],
['house', 'garden', '0,02%'],
['house', 'garden', '0,02%'],
['garden', '0,02%'],
['house', 'garden', '0,2%'],
['garden', '0,2%'],
['house', '0,2'],
['house', '0,2', '%'],
['house', 'garden', 'kids', '0,2%'],
['house', 'garden', 'kids', '0,2%'],
['house', '0,2%', 'boy'],
['house', '0,12%'],
['house', '4%.'],
['house', '4%.', '4.'],
['house', '0,2%".']]
result = defaultdict(list)
keywords = ['house', 'garden']
for l in my_list:
    numbers = [v for v in l if re.match(r'[0-9,.]+', v)]
    for v in l:
        if v in keywords:
            result[v].extend(numbers)
print(result)
Prints:
defaultdict(<class 'list'>, {'house': ['0,2%', '0,2%', '0,2%', '0,2%', '0,2%', '0,02%', '0,02%', '0,2%', '0,2', '0,2', '0,2%', '0,2%', '0,2%', '0,12%', '4%.', '4%.', '4.', '0,2%".'], 'garden': ['0,2%', '0,2%', '0,2%', '0,2%', '0,2%', '0,125%', '0,02%', '0,02%', '0,02%', '0,2%', '0,2%', '0,2%', '0,2%']})
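Note that the values printed above still carry the trailing % sign and punctuation, because the whole matching string is stored. If only the numeric part is wanted, one option is to keep the text the pattern actually matched instead of the whole element. The following is a minimal sketch, not part of the original answer: the helper name extract_numbers, the shortened sample data, and the stricter pattern (digits, optionally followed by one comma or dot and more digits) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Shortened sample of the data from the question (assumption: same structure)
sample = [['word:', 'house', 'garden', '0,2%'],
          ['garden', '0,2%', '0,125%'],
          ['house', '0,2', '%'],
          ['house', '4%.', '4.'],
          ['house', '0,2%".']]

keywords = ['house', 'garden']

# Assumed number format: digits, optionally a comma or dot plus more digits
number_re = re.compile(r'\d+(?:[.,]\d+)?')


def extract_numbers(rows, keywords):
    """Hypothetical helper: map each keyword to the bare numbers in its rows."""
    result = defaultdict(list)
    for row in rows:
        numbers = []
        for v in row:
            m = number_re.match(v)  # match() only looks at the start of v
            if m:
                numbers.append(m.group())  # keep the matched part, not all of v
        for v in row:
            if v in keywords:
                result[v].extend(numbers)
    return dict(result)


print(extract_numbers(sample, keywords))
# {'house': ['0,2', '0,2', '4', '4', '0,2'], 'garden': ['0,2', '0,2', '0,125']}
```

With this pattern a trailing dot is dropped as well, so '4.' becomes '4'. If the dot should be kept, as in the expected output in the question, the pattern would need to be adjusted accordingly.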
Regarding "python - Extracting numbers from a list of lists", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/62502800/