python - Convert a list into sublists while preserving the "key"

I have a list of "keys" and "paragraphs". Each "key" is associated with one "paragraph".

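My goal is to split each paragraph into individual sentences, with each sentence assigned to the "key" that its paragraph originally belonged to. For example: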

(['2925729', 'Patrick came outside and greeted us promptly.'], ['2925729', 'Patrick did not shake our hands nor ask our names. He greeted us promptly and politely, but it seemed routine.'], ['2925728', 'Patrick sucks. He farted politely, but it seemed routine.'])

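I have already been able to write code that splits paragraphs into sentences and counts how many dictionary terms each sentence hits. I would now like to associate an ID with each sentence.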

Here is the code that handles sentences without any "key". To save space, I have omitted steps 1 and 2:

import operator

dictionary = ['book', 'should have', 'open']

#### Step 3 ####
# Create a blank list to collect the final output
final_out = []

# Find matches: count how many dictionary terms occur in each sentence
# (`sentences` comes from the omitted steps 1 and 2)
for sent in sentences:
    final_out.append((sent, sum(sent.count(col) for col in dictionary)))

# Keep only distinct (sentence, count) pairs, stored as a dict
final_out = dict(sorted(set(final_out)))

# Rank sentences by hit count, highest first
sorted_final_out = sorted(final_out.items(), key=operator.itemgetter(1), reverse=True)

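The output is: (['Johnny ate the antelope', 80], ['Sally has a friend', 20]) and so on. I then pick the top X by magnitude. What I want to achieve now is something like this: (['12222', 'Johnny ate the antelope', 80], ['22332', 'Sally has a friend', 20]). So basically I want to make sure every sentence is assigned to a "key" when it is parsed. Sorry, this is complicated. That is also why John's earlier solution works for the simpler case.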

Best answer

from itertools import chain
# x is the tuple of [key, paragraph] pairs from the question
list(chain(*[[[y[0],z] for z in y[1].split('. ')] for y in x]))

produces

[['2925729', 'Patrick came outside and greeted us promptly.'],
 ['2925729', 'Patrick did not shake our hands nor ask our names'],
 ['2925729', 'He greeted us promptly and politely, but it seemed routine.'],
 ['2925728', 'Patrick sucks'],
 ['2925728', 'He farted politely, but it seemed routine.']]

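list(chain(*...)) flattens the nested list produced by [[[y[0],z] for z in y[1].split('. ')] for y in x]. Note that splitting on '. ' drops the trailing period from every sentence except the last one in each paragraph.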
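As a hedged aside, the same flattening can also be written with `chain.from_iterable`, which avoids unpacking the outer list with `*` and lets the inner pieces stay lazy. The sketch below uses the sample data from the question; the helper name `split_keyed` and its `sep` parameter are hypothetical and not part of the original answer.

```python
from itertools import chain

# Sample input copied from the question: a tuple of [key, paragraph] pairs.
x = (
    ['2925729', 'Patrick came outside and greeted us promptly.'],
    ['2925729', 'Patrick did not shake our hands nor ask our names. '
                'He greeted us promptly and politely, but it seemed routine.'],
    ['2925728', 'Patrick sucks. He farted politely, but it seemed routine.'],
)

def split_keyed(pairs, sep='. '):
    """Hypothetical helper: lazily yield one [key, sentence] list per sentence."""
    return chain.from_iterable(
        ([key, sentence] for sentence in paragraph.split(sep))
        for key, paragraph in pairs
    )

result = list(split_keyed(x))
```

Because `split_keyed` returns an iterator, the full list is only built when `list(...)` is called, which may matter for large inputs.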

If you want to change the list "in place", you can use

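xl = list(x)  # you gave us a tuple
for i, y in enumerate(xl):
    xx = xl[i]
    # Replace the single [key, paragraph] item with its [key, sentence] items.
    # Inserted items are visited again later, but they no longer contain '. ',
    # so they are left unchanged.
    xx = [[xx[0], s] for s in xx[1].split('. ')]
    xl[i:i+1] = xx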
I am not sure which of the two would be faster or better when the dataset is very large.
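To connect this back to the question's goal of a (key, sentence, count) result, here is a minimal sketch that keeps the key next to each sentence while counting dictionary hits and ranking by hit count. It reuses the `dictionary` list from the question; the function name `score_keyed_sentences`, the tie-breaking order, and the small sample input are assumptions made purely for illustration.

```python
def score_keyed_sentences(pairs, dictionary, sep='. '):
    """Return distinct (key, sentence, hits) tuples, highest hit count first."""
    scored = set()  # a set drops exact duplicates, as in the question's code
    for key, paragraph in pairs:
        for sentence in paragraph.split(sep):
            # str.count counts non-overlapping substring occurrences
            hits = sum(sentence.count(term) for term in dictionary)
            scored.add((key, sentence, hits))
    # Sort by hits descending, then by key and sentence for a stable order
    return sorted(scored, key=lambda row: (-row[2], row[0], row[1]))

dictionary = ['book', 'should have', 'open']

# Made-up sample data, not taken from the question
pairs = [
    ['1', 'I should have read the book. It was open.'],
    ['2', 'Nothing to see here.'],
]

ranked = score_keyed_sentences(pairs, dictionary)
top_two = ranked[:2]  # pick the top X rows, here X = 2
```

Since `str.count` matches substrings, a term such as 'open' would also count inside 'opened'; whole-word matching would need a different approach.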

Regarding "python - Convert a list into sublists while preserving the "key"", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/20861203/
