python - How do I convert multiple word lists into a Pandas DataFrame?

Tags: python python-3.x csv pandas

I have a .txt file that contains lists of words like this:

5.91686268506 exclusively, catering, provides, arms, georgia, formal, purchase, choose
5.91560417296 hugh, senlis
5.91527936181 italians
5.91470429433 soil, cultivation, fertile
5.91468087491 increases, moderation
....
5.91440227412 farmers, descendants

I want to convert this data into a pandas table, which I then plan to display in an HTML/Bootstrap template, like this (*):

COL_A         COL_B
5.91686268506 exclusively, catering, provides, arms, georgia, formal, purchase, choose
5.91560417296 hugh, senlis
5.91527936181 italians
5.91470429433 soil, cultivation, fertile
5.91468087491 increases, moderation
....
5.91440227412 farmers, descendants

So I tried the following with pandas:

import pandas as pd
df = pd.read_csv('file.csv', 
                 sep = ' ', names=['Col_A', 'Col_B'])
df.head(20)

However, my table does not have the desired structure shown above:

                                                                                                                                COL_A   COL_B
6.281426    engaged,    chance,     makes,  meeting,    nations,    things,     believe,    tries,  believing,  knocked,    admits,     awkward
6.277438    sweden  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
6.271190    artificial,     ammonium    NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
6.259790    boats,  prefix  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
6.230612    targets,    tactical,   wing,   missile,    squadrons   NaN     NaN     NaN     NaN     NaN     NaN     NaN

Any idea how to get the data into the (*) table format?

Best answer

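Because the words themselves are separated by spaces, specifying a space as the separator naturally splits them apart too. To get what you need, you can set sep to the regular expression "(?<!,) " (note the trailing space). The (?<! ... ) construct is a negative lookbehind, so the split happens on a space only when that space is not preceded by a comma, which should work for your case. pandas handles regex separators with its Python parsing engine: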

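import pandas as pd

# A raw string keeps the regex readable; engine="python" is the engine pandas uses for regex separators
pd.read_csv("~/test.csv", sep=r"(?<!,) ", names=['weight', 'topics'], engine="python")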

#     weight    topics
#0  5.916863    exclusively, catering, provides, arms, georgia...
#1  5.915604    hugh, senlis
#2  5.915279    italians
#3  5.914704    soil, cultivation, fertile
#4  5.914681    increases, moderation
#5  5.914402    farmers, descendants
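
For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same negative-lookbehind separator. It is not part of the original answer: the in-memory buffer stands in for the .txt file, the rows are copied from the question, and the Bootstrap class names passed to to_html ("table table-striped") are an assumption chosen only to illustrate the HTML step the question mentions.

```python
import io

import pandas as pd

# Sample rows copied from the question; the buffer stands in for the real file.
raw = """5.91686268506 exclusively, catering, provides, arms, georgia, formal, purchase, choose
5.91560417296 hugh, senlis
5.91527936181 italians
5.91470429433 soil, cultivation, fertile
5.91468087491 increases, moderation
5.91440227412 farmers, descendants
"""

df = pd.read_csv(
    io.StringIO(raw),
    sep=r"(?<!,) ",              # split on a space only when it does not follow a comma
    names=["weight", "topics"],  # no header row in the data, so supply column names
    engine="python",             # regex separators are handled by the Python engine
)

# Render the frame as an HTML table. The CSS classes are assumed Bootstrap
# names; swap in whatever your template expects.
html = df.to_html(classes="table table-striped", index=False)
```

The resulting html string can be dropped into a template as is. Note that this approach relies on every word after the first being preceded by ", ", so a line with a bare space between two words would be split into an extra column.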

This question, python - How do I convert multiple word lists into a Pandas DataFrame?, corresponds to a similar question on Stack Overflow: https://stackoverflow.com/questions/39339560/
