python - How to read a CSV file with quote characters and commas in Pandas?

Tags: python pandas csv

I have a dataset like this:

ISIN,"MIC","Datum","Open","Hoog","Laag","Close","Number of Shares","Number of Trades","Turnover","Valuta"
NL0011821202,"Euronext Amsterdam Brussels","04/09/2017","14.82","14.95","14.785","14.855","7482805","6970","111345512.83","EUR"
NL0011821202,"Euronext Amsterdam Brussels","05/09/2017","14.91","14.92","14.585","14.655","15240971","12549","224265257.14","EUR"
NL0011821202,"Euronext Amsterdam Brussels","07/09/2017","14.69","14.74","14.535","14.595","15544695","15817","227478163.74","EUR"

But I cannot read the file correctly with pd.read_csv('filename.csv'). I have tried various combinations, such as:

 sep='"',
 delimiter=","

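But no luck at all! I want the first row to become the column names, and the quote characters and commas to be removed.

How can I solve this efficiently?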

Best answer

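The problem is that the quote character " sometimes appears doubled. The solution is to change the separator to a regular expression that matches zero or more " before and after the comma. Regular-expression separators are only supported by the Python parsing engine, hence engine='python':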

import pandas as pd
df = pd.read_csv('ING_DAILY - ING_DAILY.csv', sep='["]*,["]*', engine='python')
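
To get a feel for what that separator pattern matches, here is a minimal sketch that applies the same regular expression with Python's re module to one data line copied from the question. It is only an illustration of the pattern, not a description of how pandas parses the file internally; the names pattern, line and fields are made up for this example.

```python
import re

# Same pattern as the `sep` argument: optional quotes, a comma, optional quotes.
pattern = re.compile('["]*,["]*')

# First data line from the sample in the question.
line = ('NL0011821202,"Euronext Amsterdam Brussels","04/09/2017","14.82",'
        '"14.95","14.785","14.855","7482805","6970","111345512.83","EUR"')

fields = pattern.split(line)
print(len(fields))   # 11 fields, one per column
print(fields[1])     # inner quotes are consumed by the separator
print(fields[-1])    # the last field still carries a trailing quote: EUR"
```

The quote at the very end of the line has no comma next to it, so the separator cannot consume it. The same applies to a quote at the very start of a line, if the file has one.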

Then it is necessary to remove the " from the column names and from the first and last columns:

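# Strip the leftover quotes from the header names.
df.columns = df.columns.str.strip('"')
# Strip them from the values of the first and last columns as well.
df.iloc[:, [0, -1]] = df.iloc[:, [0, -1]].apply(lambda x: x.str.strip('"'))
print(df.head(3))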

           ISIN                          MIC       Datum   Open    Hoog  \
0  NL0011821202  Euronext Amsterdam Brussels  04/09/2017  14.82  14.950   
1  NL0011821202  Euronext Amsterdam Brussels  05/09/2017  14.91  14.920   
2  NL0011821202  Euronext Amsterdam Brussels  06/09/2017  14.69  14.725   

     Laag   Close  Number of Shares  Number of Trades      Turnover Valuta  
0  14.785  14.855           7482805              6970  1.113455e+08    EUR  
1  14.585  14.655          15240971             12549  2.242653e+08    EUR  
2  14.570  14.615          14851426             15303  2.175316e+08    EUR  
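
For readers without the original file, the following is a self-contained sketch of the same approach run on an in-memory copy of the header and the first two data rows from the question. The variable names (sample, df, col) are illustrative, and the quote cleanup uses a plain loop over the first and last columns instead of the iloc assignment above; that is a substitution made for this sketch, not part of the answer.

```python
import io

import pandas as pd

# Header and first two data rows copied from the question.
sample = (
    'ISIN,"MIC","Datum","Open","Hoog","Laag","Close",'
    '"Number of Shares","Number of Trades","Turnover","Valuta"\n'
    'NL0011821202,"Euronext Amsterdam Brussels","04/09/2017","14.82","14.95",'
    '"14.785","14.855","7482805","6970","111345512.83","EUR"\n'
    'NL0011821202,"Euronext Amsterdam Brussels","05/09/2017","14.91","14.92",'
    '"14.585","14.655","15240971","12549","224265257.14","EUR"\n'
)

# The regex separator swallows the quotes around each comma;
# regex separators need the Python parsing engine.
df = pd.read_csv(io.StringIO(sample), sep='["]*,["]*', engine='python')

# Quotes at the start or end of a line are not next to a comma,
# so they survive in the outermost header name and column values.
df.columns = df.columns.str.strip('"')
for col in (df.columns[0], df.columns[-1]):
    df[col] = df[col].str.strip('"')

print(df.head())
print(df.dtypes)
```

Because the quotes are removed while splitting, the numeric columns are parsed as numbers directly, while Datum stays a string unless it is converted separately.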

Regarding "python - How to read a CSV file with quote characters and commas in Pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52166302/
