I have some lines of text, each with an associated weight.
Weight, Text
10, "I like apples"
20, "Someone needs apples"
Is it possible to get the word combinations while keeping the value from the weight column? Something like this:
weight, combinations
10, [I like]
10, [I apples]
10, [like apples]
20, [someone needs]
20, [someone apples]
20, [needs apples]
"Generate n-grams from Pandas column while persisting another column" is a similar question, but it is still unsolved.
Thanks!!!
Best answer
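from itertools import combinations

import pandas as pd

df = pd.DataFrame({'Weight': [10, 20],
                   'Text': ["I like apples", "Someone needs apples"]})
# Build every 2-word combination for each row (word order is preserved)
df['Combinations'] = df.Text.apply(lambda x: list(combinations(x.split(), 2)))
# One row per combination; Weight is repeated for each exploded row
df = df.explode('Combinations')
# The original text is no longer needed
df.drop('Text', axis=1, inplace=True)
print(df)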
Output:
   Weight       Combinations
0      10          (I, like)
0      10        (I, apples)
0      10     (like, apples)
1      20   (Someone, needs)
1      20  (Someone, apples)
1      20    (needs, apples)
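
If the same step is needed for other combination sizes, the answer can be wrapped in a small helper. This is only a sketch: the function name `word_combinations` and its parameters `text_col` and `r` are made up for illustration and are not part of the original answer. It assumes whitespace-separated words and resets the index so the rows are numbered consecutively.

```python
from itertools import combinations

import pandas as pd


def word_combinations(df, text_col="Text", r=2):
    """Return one row per r-word combination, keeping the other columns.

    Hypothetical helper: name and parameters are illustrative only.
    """
    # List of r-word tuples for each row's text
    combos = df[text_col].apply(lambda s: list(combinations(s.split(), r)))
    out = df.assign(Combinations=combos)
    # explode() repeats the remaining columns (e.g. Weight) for every tuple
    out = out.explode("Combinations").drop(columns=text_col)
    return out.reset_index(drop=True)


df = pd.DataFrame({"Weight": [10, 20],
                   "Text": ["I like apples", "Someone needs apples"]})
result = word_combinations(df, r=2)
print(result)
```

Note that a text with fewer than `r` words yields an empty list, which `explode` turns into a row with NaN in `Combinations`; drop those rows afterwards if they are not wanted.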
Source: the Stack Overflow question 'python - Similar to "Generate n-grams from Pandas column while persisting another column" (unsolved), but with values': https://stackoverflow.com/questions/65971168/