python - How to convert a string into a DataFrame when one column contains spaces

Tags: python pandas dataframe

Below is a sample string. How can I convert this string into a pandas DataFrame?

    str1 = """
    Feature Id & Feature Desc                             Status   Failed Total 
    ---------------------------------------------------   -------- ------ -----
    RKSPACE (RackSpace Test In)                           Passed   0      1     
    D1 (Drum 1 Test)                                      Passed   0      1     
    D2 (Drum 2 Test)                                      Passed   0      1     
    D3 (Drum 3 Test)                                      Passed   0      1     
    PRIMUS (PRIMUS Ink Test)                              Not-run  0      0     
    RGB (RGB Color Test)                                  Passed   0      1     
    YONO (App Test)                                       Not-run  0      0     
    PSENSE (Paper Sensor Test)                            Not-run  0      0     
    TFlag (Flag Test)                                     Not-run  0      0     
    MEMT (Memory Test)                                    Passed   0      1     
    CRG (CARRIAGE Test)                                   Not-run  0      0    
    """

I tried the following code:

    import pandas as pd
    from io import StringIO  # Python 3 location; the Python 2 module was named StringIO
    def get_dataframe(str1):
        test_data = StringIO(str1)
        df = pd.read_csv(test_data, sep=r'\s+', comment='--', engine='python')
        return df

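The result I get is ugly and incorrect. I checked other posts but found none that deal with spaces inside a column. Normally, if the first column contained no spaces, getting a DataFrame would be easy, but how can I convert this string into a DataFrame that keeps the same layout as str1? Any help would be appreciated. Thanks.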

Best answer

You can use read_fwf, which parses fixed-width columns instead of splitting on whitespace:

str1 = """
Feature Id & Feature Desc                             Status   Failed Total 
---------------------------------------------------   -------- ------ -----
RKSPACE (RackSpace Test In)                           Passed   0      1     
D1 (Drum 1 Test)                                      Passed   0      1     
D2 (Drum 2 Test)                                      Passed   0      1     
D3 (Drum 3 Test)                                      Passed   0      1     
PRIMUS (PRIMUS Ink Test)                              Not-run  0      0     
RGB (RGB Color Test)                                  Passed   0      1     
YONO (App Test)                                       Not-run  0      0     
PSENSE (Paper Sensor Test)                            Not-run  0      0     
TFlag (Flag Test)                                     Not-run  0      0     
MEMT (Memory Test)                                    Passed   0      1     
CRG (CARRIAGE Test)                                   Not-run  0      0    
"""

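import io
import pandas as pd

# io.StringIO replaces pd.compat.StringIO, which newer pandas releases no longer provide
df = pd.read_fwf(io.StringIO(str1),
                 colspecs=[(0, 50), (51, 62), (63, 69), (70, 76)],  # (start, end) character offsets per column
                 skiprows=[2],   # skip the dashed rule line
                 header=[1])     # line 0 is the empty line after the opening quotes
print(df)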
      Feature Id & Feature Desc   Status  Failed  Total
0   RKSPACE (RackSpace Test In)   Passed       0      1
1              D1 (Drum 1 Test)   Passed       0      1
2              D2 (Drum 2 Test)   Passed       0      1
3              D3 (Drum 3 Test)   Passed       0      1
4      PRIMUS (PRIMUS Ink Test)  Not-run       0      0
5          RGB (RGB Color Test)   Passed       0      1
6               YONO (App Test)  Not-run       0      0
7    PSENSE (Paper Sensor Test)  Not-run       0      0
8             TFlag (Flag Test)  Not-run       0      0
9            MEMT (Memory Test)   Passed       0      1
10          CRG (CARRIAGE Test)  Not-run       0      0
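
The colspecs above were typed in by hand. As a hedged sketch of one alternative, the dashed rule line already marks where each column starts and ends, so the offsets can be read from it with a regular expression. The helper name `parse_report` and the shortened three-row sample are assumptions made for illustration; they are not part of the original answer.

```python
import io
import re

import pandas as pd

# Header, rule, and three rows copied from the question.
report = """
Feature Id & Feature Desc                             Status   Failed Total 
---------------------------------------------------   -------- ------ -----
RKSPACE (RackSpace Test In)                           Passed   0      1     
D1 (Drum 1 Test)                                      Passed   0      1     
PRIMUS (PRIMUS Ink Test)                              Not-run  0      0     
"""


def parse_report(text):
    """Hypothetical helper: build a DataFrame using the dashed rule as column spec."""
    # Drop empty lines such as the one right after the opening quotes.
    lines = [line for line in text.splitlines() if line.strip()]
    header_line, rule_line, data_lines = lines[0], lines[1], lines[2:]

    # Each run of dashes gives the (start, end) offsets of one column.
    colspecs = [match.span() for match in re.finditer(r"-+", rule_line)]

    # Column names are the header text sliced at the same offsets.
    names = [header_line[start:end].strip() for start, end in colspecs]

    # Only the data rows go to pandas, so no header or skiprows handling is needed.
    return pd.read_fwf(io.StringIO("\n".join(data_lines)),
                       colspecs=colspecs, header=None, names=names)


df = parse_report(report)
print(df)
```

This assumes the rule line is at least as wide as the widest value in each column, which holds for the sample in the question.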

Thanks to @gyoza for a simplified solution that lets read_fwf infer the column boundaries:

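import io
import pandas as pd

# io.StringIO replaces pd.compat.StringIO, which newer pandas releases no longer provide
df = pd.read_fwf(io.StringIO(str1),
                 skiprows=[2],
                 header=[1])
print(df)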
      Feature Id & Feature Desc   Status  Failed  Total
0   RKSPACE (RackSpace Test In)   Passed       0      1
1              D1 (Drum 1 Test)   Passed       0      1
2              D2 (Drum 2 Test)   Passed       0      1
3              D3 (Drum 3 Test)   Passed       0      1
4      PRIMUS (PRIMUS Ink Test)  Not-run       0      0
5          RGB (RGB Color Test)   Passed       0      1
6               YONO (App Test)  Not-run       0      0
7    PSENSE (Paper Sensor Test)  Not-run       0      0
8             TFlag (Flag Test)  Not-run       0      0
9            MEMT (Memory Test)   Passed       0      1
10          CRG (CARRIAGE Test)  Not-run       0      0
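
The simplified call works without colspecs because the column boundaries can be inferred from the text itself. The sketch below illustrates the general idea only and is not pandas' actual implementation: a character position separates two columns when it is blank in every row. The function name `infer_colspecs` and the shortened sample are assumptions for illustration.

```python
# Header and three data rows copied from the question; the dashed rule is left
# out, mirroring skiprows in the answer.
lines = [
    "Feature Id & Feature Desc                             Status   Failed Total ",
    "RKSPACE (RackSpace Test In)                           Passed   0      1     ",
    "D1 (Drum 1 Test)                                      Passed   0      1     ",
    "PRIMUS (PRIMUS Ink Test)                              Not-run  0      0     ",
]


def infer_colspecs(rows):
    """Hypothetical helper: return (start, end) spans of positions used by any row."""
    width = max(len(row) for row in rows)
    # used[i] is True when at least one row has a non-space character at position i.
    used = [any(i < len(row) and row[i] != " " for row in rows) for i in range(width)]

    spans, start = [], None
    # The trailing False closes a span that runs to the end of the line.
    for i, flag in enumerate(used + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            spans.append((start, i))
            start = None
    return spans


spans = infer_colspecs(lines)
names = [lines[0][start:end].strip() for start, end in spans]
print(spans)
print(names)
```

The spaces inside the first column do not split it, because each of those positions holds a non-space character in at least one other row. Splitting on `\s+` has no such context and breaks the first column at every space, which is why the attempt in the question fails.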

This question, "python - How to convert a string into a DataFrame when one column contains spaces", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/52658676/
