python - How to convert sentences to vectors

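I have a dictionary whose keys are words and whose values are the vectors for those words. I also have a list of sentences that I want to convert to an array. So far I get an array of all the words, but what I want is an array of sentences made up of word vectors, so that I can feed it into a neural network.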

sentences=["For last 8 years life, Galileo house arrest espousing man's theory",
           'No. 2: 1912 Olympian; football star Carlisle Indian School; 6 MLB seasons Reds, Giants & Braves',
           'The city Yuma state record average 4,055 hours sunshine year'.......]    

word_vec={'For': [0.27452874183654785, 0.8040047883987427],
         'last': [-0.6316165924072266, -0.2768899202346802],
         'years': [-0.2496756911277771, 1.243837594985962],
         'life,': [-0.9836481809616089, -0.9561406373977661].....}

I want to convert the sentences above into the vectors of the corresponding words in the dictionary.

Best answer

Try this:

def sentence_to_list(sentence, words_dict):
    return [w for w in sentence.split() if w in words_dict]

So the first sentence in the example is converted to:

['For', 'last', 'years', 'life,']  # words not in the dictionary are not present here; 'life,' keeps its comma because split() only splits on whitespace
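The function above returns the matching words, while the question asks for their vectors. A minimal sketch of that extra step, assuming the same whitespace tokenization and using only the dictionary entries shown in the question; `sentence_to_vectors` is a hypothetical name, not part of the original answer:

```python
def sentence_to_vectors(sentence, words_dict):
    """Return the vectors of the words in `sentence` that are keys of `words_dict`."""
    # Look up each whitespace-separated token; skip tokens that have no vector.
    return [words_dict[w] for w in sentence.split() if w in words_dict]


# Only the entries visible in the question; the real dictionary is larger.
word_vec = {
    'For': [0.27452874183654785, 0.8040047883987427],
    'last': [-0.6316165924072266, -0.2768899202346802],
    'years': [-0.2496756911277771, 1.243837594985962],
    'life,': [-0.9836481809616089, -0.9561406373977661],
}
sentences = ["For last 8 years life, Galileo house arrest espousing man's theory"]

# One list of word vectors per sentence.
sentence_vectors = [sentence_to_vectors(s, word_vec) for s in sentences]
```

Each sentence yields a list with a different number of vectors, so the result would probably still need padding or truncation to a fixed length before it can be stacked into a single array for a neural network.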

Update.

I think you need to remove the punctuation. There are several ways to split a string on multiple delimiters; see this answer: Split Strings into words with multiple word boundary delimiters
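One possible way to drop the punctuation, sketched with the standard `re` module. The pattern and the names `tokenize` and `sentence_to_list_clean` are assumptions for illustration, not taken from the linked answer:

```python
import re


def tokenize(sentence):
    """Split a sentence into words, dropping punctuation other than apostrophes."""
    # Keep runs of ASCII letters, digits and apostrophes; everything else is a separator.
    return re.findall(r"[A-Za-z0-9']+", sentence)


def sentence_to_list_clean(sentence, words_dict):
    """Like sentence_to_list, but tokenizes with tokenize() instead of str.split()."""
    return [w for w in tokenize(sentence) if w in words_dict]


tokens = tokenize("For last 8 years life, Galileo house arrest espousing man's theory")
```

With this tokenizer `life,` becomes `life`, so the dictionary keys would have to be cleaned the same way (the dictionary in the question has the key `'life,'`, comma included), otherwise those words no longer match.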

Regarding python - How to convert sentences to vectors, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/55684956/
