Python pd.read flags an error when run in Rodeo

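I have checked the Python documentation, and as far as I can tell there is nothing wrong with this code. However, when I run it in the Rodeo IDE, it is flagged as an error. Could someone point me in the right direction or post an answer that solves this?

My code: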

#%matplotlib inline
import pandas as pd

import numpy as np
import matplotlib.pyplot as plt

#pd.set_option("display.max_rows", 16)

#LARGE_FIGSIZE = (12, 8)

#C:\\Users\\User\\Documents\\dataScience\\pandas_tutorial\\climate_timeseries\\data\\temperatures\\xxxx.txt"

#https://github.com/jonathanrocher/pandas_tutorial/blob/master/climate_timeseries/data/temperatures/GLB.Ts+dSST.txt

giss_temp = pd.read_table("C:\\Users\\User\\Documents\\dataScience\\pandas_tutorial\\climate_timeseries\\data\\temperatures\\xxxx.txt",sep="\s+",skiprows=7,skipfooter=7,engine="python")

print(giss_temp)

My error message:

ValueError: Expected 2 fields in line 160, saw 87
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-31-02b52eb96870> in <module>()
     11 #C:\\Users\\User\\Documents\\dataScience\\pandas_tutorial\\climate_timeseries\\data\\temperatures\\xxxx.txt"
     12 #https://github.com/jonathanrocher/pandas_tutorial/blob/master/climate_timeseries/data/temperatures/GLB.Ts+dSST.txt
---> 13 giss_temp = pd.read_table("https://github.com/jonathanrocher/pandas_tutorial/blob/master/climate_timeseries/data/temperatures/GLB.Ts+dSST.txt",sep="\s+",skiprows=7,skipfooter=7,engine="python")
     14 print(giss_temp)
C:\Users\User\AppData\Local\rodeo\app-2.5.2\resources\conda\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
    643                     skip_blank_lines=skip_blank_lines)
    644 
--> 645         return _read(filepath_or_buffer, kwds)
    646 
    647     parser_f.__name__ = name
C:\Users\User\AppData\Local\rodeo\app-2.5.2\resources\conda\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
    398         return parser
    399 
--> 400     data = parser.read()
    401     parser.close()
    402     return data
C:\Users\User\AppData\Local\rodeo\app-2.5.2\resources\conda\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
    936                 raise ValueError('skipfooter not supported for iteration')
    937 
--> 938         ret = self._engine.read(nrows)
    939 
    940         if self.options.get('as_recarray'):
C:\Users\User\AppData\Local\rodeo\app-2.5.2\resources\conda\lib\site-packages\pandas\io\parsers.py in read(self, rows)
   1990             content = content[1:]
   1991 
-> 1992         alldata = self._rows_to_cols(content)
   1993         data = self._exclude_implicit_index(alldata)
   1994 
C:\Users\User\AppData\Local\rodeo\app-2.5.2\resources\conda\lib\site-packages\pandas\io\parsers.py in _rows_to_cols(self, content)
   2505             msg = ('Expected %d fields in line %d, saw %d' %
   2506                    (col_len, row_num + 1, zip_len))
-> 2507             raise ValueError(msg)
   2508 
   2509         if self.usecols:
ValueError: Expected 2 fields in line 160, saw 87

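Best answer

You should use the raw file URL. The github.com/.../blob/... address returns GitHub's HTML page for the file rather than the file's contents, so the parser sees HTML markup and reports mismatched field counts: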

In [390]: pd.options.display.max_rows = 10

In [391]: url = 'https://raw.githubusercontent.com/jonathanrocher/pandas_tutorial/master/climate_timeseries/data/temperatures/GLB.Ts%2BdSST.txt'

In [392]: pd.read_csv(url, skiprows=7, delim_whitespace=True, skipfooter=12, error_bad_lines=False, engine='python')
Out[392]:
     Year  Jan  Feb  Mar  Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec   J-D  D-N   DJF   MAM   JJA   SON Year.1
0    1880  -34  -27  -22  -30   -16   -24   -19   -12   -20   -19   -16   -21   -22  ***  ****   -23   -18   -18   1880
1    1881  -13  -16   -2   -3    -3   -27   -12    -8   -18   -23   -28   -18   -14  -14   -17    -3   -15   -23   1881
2    1882    3    4   -2  -24   -20   -32   -27   -11   -11   -25   -25   -37   -17  -16    -4   -15   -23   -20   1882
3    1883  -38  -38  -12  -20   -20    -8    -3   -13   -19   -19   -28   -21   -20  -21   -38   -18    -8   -22   1883
4    1884  -20  -14  -31  -36   -33   -36   -31   -24   -29   -25   -29   -25   -28  -28   -18   -33   -31   -28   1884
..    ...  ...  ...  ...  ...   ...   ...   ...   ...   ...   ...   ...   ...   ...  ...   ...   ...   ...   ...    ...
137  2011   45   44   57   60    47    54    70    69    52    60    50    48    55   55    45    55    64    54   2011
138  2012   38   43   52   62    71    59    50    56    68    73    69    46    57   57    43    62    55    70   2012
139  2013   62   52   60   48    56    61    53    61    73    61    75    61    60   59    53    55    58    70   2013
140  2014   68   44   71   72    79    62    50    74    81    78    64    74    68   67    58    74    62    74   2014
141  2015   75   80   84   71  ****  ****  ****  ****  ****  ****  ****  ****  ****  ***    76  ****  ****  ****   2015

[142 rows x 20 columns]
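
If you often start from the address shown in the browser, the conversion to the raw URL can be sketched with the standard library alone. This is only a minimal sketch: `github_blob_to_raw` is a hypothetical helper name (not part of pandas or any GitHub library), and it assumes the usual github.com/owner/repo/blob/branch/path layout with a branch name that contains no slash.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Hypothetical helper: turn a github.com '/blob/' page URL into the
    matching raw.githubusercontent.com URL. Other URLs are returned unchanged."""
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        return url

    # Assumed layout: owner/repo/blob/branch/path/to/file
    segments = parts.path.lstrip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        return url

    owner, repo, _blob, *rest = segments
    raw_path = "/" + "/".join([owner, repo, *rest])

    # Percent-encode characters such as '+' (becomes %2B), while keeping
    # '/' separators and any escapes that are already present.
    encoded_path = quote(raw_path, safe="/%")
    return urlunsplit(("https", "raw.githubusercontent.com", encoded_path, "", ""))


if __name__ == "__main__":
    blob_url = (
        "https://github.com/jonathanrocher/pandas_tutorial/blob/master/"
        "climate_timeseries/data/temperatures/GLB.Ts+dSST.txt"
    )
    print(github_blob_to_raw(blob_url))
```

For the URL from the question, this yields the same raw address used in the answer, which can then be passed to pd.read_csv with the arguments shown there. Note that in recent pandas releases `error_bad_lines` has been replaced by `on_bad_lines`, and `sep=r"\s+"` is the suggested alternative to `delim_whitespace`, so the call may need adjusting depending on your installed version.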

This question about pd.read flagging an error when run in Rodeo is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/48254732/
