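I want to import the following .csv data (stored in a .txt file) into Python lists, one list per column, ignoring the text at the start of the file. I cannot change the file's format. I get this error: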
"Traceback (most recent call last):
File "/Users/Hamish/Desktop/Python/AWBM/Import.py", line 13, in <module>
rain_column = float(row[7])
IndexError: list index out of range"
Here is the code I am trying to run...
import csv
import numpy as np
file = open('Data_Bris.txt')
reader = csv.reader(file, delimiter=' ')
datelist = []
rainlist = []
evaplist = []
for row in reader:
    # row = [date, day, date2, T.Max, Smx, T.Min, Smn, Rain, Srn, Evap, Sev, Rad, Ssl, VP, Svp, maxT, minT, Span, Ssp]
    date_column = str(row[0])
    rain_column = float(row[7])
    evap_column = float(row[9])
    datelist.append([date_column])
    rainlist.append([rain_column])
    evaplist.append([evap_column])
date = np.array([datelist])
rain = np.array([rainlist])
evap = np.array([evaplist])
timeseries = np.arange(rain.size)
Here is the data file I want to import (it continues in the same way)...
"17701231" 365 31/12/1770 -99.9 999 -99.9 999 9999.9 999 999.9 999 999.9 999 999.9 999 9999.9 9999.9 9999.9 999
""
" This file is SPACE DELIMITED for easy import into both spreadsheets and programs."
"The first line 17701231 contains dummy data and is provided to allow spreadsheets to sense the columns"
" To read into a spreadsheet select DELIMITED and SPACE."
" "
" "
"========= The following essential information and notes should be kept in the data file =========="
" "
"The Data Drill system and data are copyright to the Queensland Government Department of Science, Information Technology and Innovation (DSITI)."
"SILO data, with the exception of Patched Point data for Queensland, are supplied to the licencee only and may not be given, lent, or sold to any other party"
" "
"Notes:"
" * Data Drill for Lat, Long: -27.5000 153.0000 (DECIMAL DEGREES), 27 30'S 153 00'E Your Ref: Data_Bris"
" * Elevation: 102m "
" * Extracted from Silo on 20171214"
" * Please read the documentation on the Data Drill at http://www.longpaddock.qld.gov.au/silo"
" "
" * As evaporation is read at 9am, it has been shifted to the day before"
" ie The evaporation measured on 20 April is in row for 19 April"
" * The 6 Source columns Smx, Smn, Srn, Sev, Ssl, Svp indicate the source of the data to their left, namely Max temp, Min temp, Rainfall, Evaporation, Radiation and Vapour Pressure respectively "
" "
" 35 = interpolated from daily observations using anomaly interpolation method for CLIMARC data
" 25 = interpolated daily observations, 75 = interpolated long term average"
" 26 = synthetic pan evaporation "
" "
" * Relative Humidity has been calculated using 9am VP, T.Max and T.Min"
" RHmaxT is estimated Relative Humidity at Temperature T.Max"
" RHminT is estimated Relative Humidity at Temperature T.Min"
" Span = a calibrated estimate of class A pan evaporation based on vapour pressure deficit and solar radiation
" * The accuracy of the data depends on many factors including date, location, and variable."
" For consistency data is supplied using one decimal place, however it is not accurate to that precision."
" Further information is available from http://www.longpaddock.qld.gov.au/silo"
"===================================================================================================="
" "
Date Day Date2 T.Max Smx T.Min Smn Rain Srn Evap Sev Radn Ssl VP Svp RHmaxT RHminT Span Ssp
(yyyymmdd) () (ddmmyyyy) (oC) () (oC) () (mm) () (mm) () (MJ/m2) () (hPa) () (%) (%) (mm) ()
18890101 1 1-01-1889 29.5 35 21.5 35 0.3 25 6.2 75 23.0 35 26.0 35 63.1 100.0 5.6 26
18890102 2 2-01-1889 32.0 35 21.5 35 0.1 25 6.2 75 23.0 35 21.0 35 44.2 81.9 6.9 26
18890103 3 3-01-1889 31.5 35 21.5 35 0.0 25 6.2 75 23.0 35 24.0 35 51.9 93.6 6.4 26
18890104 4 4-01-1889 29.5 35 21.0 35 0.0 25 6.2 75 23.0 35 22.0 35 53.4 88.5 6.1 26
18890105 5 5-01-1889 30.0 35 19.0 35 0.0 25 6.2 75 23.0 35 19.0 35 44.8 86.5 6.5 26
18890106 6 6-01-1889 28.5 35 18.5 35 0.0 25 6.2 75 23.0 35 23.0 35 59.1 100.0 5.7 26
18890107 7 7-01-1889 30.0 35 18.5 35 0.1 25 6.2 75 23.0 35 20.0 35 47.1 94.0 6.4 26
18890108 8 8-01-1889 28.0 35 18.5 35 0.0 25 6.2 75 23.0 35 21.0 35 55.6 98.7 5.8 26
18890109 9 9-01-1889 28.5 35 19.0 35 0.0 25 6.2 75 24.0 35 22.0 35 56.5 100.0 6.0 26
18890110 10 10-01-1889 29.0 35 20.0 35 0.0 25 6.2 75 23.0 35 21.0 35 52.4 89.9 6.1 26
Best answer
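Here you want to ignore every line of the header, including the lines that give the column names and formats. A simple way to do that is to skip any line that does not start with a digit. Using a generator (to avoid loading the whole file into memory), you can create your reader like this: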
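...
# t is presumably a string holding the file's text; each "row" in the
# generator is one raw line, so row[0] is that line's first character
reader = csv.reader((row for row in io.StringIO(t) if row[0].isdigit()),
                    delimiter=' ', skipinitialspace=True)
...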
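skipinitialspace=True makes the reader ignore whitespace that immediately follows a delimiter, so a run of several spaces is treated as a single separator.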
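Putting the pieces together, a minimal sketch of the whole import might look like the following. The `read_columns` helper and the `sample` string are hypothetical names introduced only for illustration, and the sample is a shortened stand-in for Data_Bris.txt that assumes the real file pads its columns with several spaces.

```python
import csv
import io

# Hypothetical miniature of the data file (assumption: the real file
# pads its columns with runs of spaces, as this sample does).
sample = '''"17701231" 365 31/12/1770 -99.9 999 -99.9 999 9999.9 999 999.9 999
""
" This file is SPACE DELIMITED for easy import into both spreadsheets and programs."
Date       Day Date2      T.Max Smx T.Min Smn   Rain Srn   Evap Sev
(yyyymmdd)  () (ddmmyyyy)  (oC)  ()  (oC)  ()   (mm)  ()   (mm)  ()
18890101     1  1-01-1889  29.5  35  21.5  35    0.3  25    6.2  75
18890102     2  2-01-1889  32.0  35  21.5  35    0.1  25    6.2  75
'''


def read_columns(lines):
    """Return (dates, rain, evap) lists from an iterable of text lines."""
    # Keep only lines whose first character is a digit; quoted notes,
    # column names and unit lines are all dropped by this test.
    data_lines = (line for line in lines if line[:1].isdigit())
    reader = csv.reader(data_lines, delimiter=' ', skipinitialspace=True)
    dates, rain, evap = [], [], []
    for row in reader:
        dates.append(row[0])         # Date column (yyyymmdd), kept as text
        rain.append(float(row[7]))   # Rain column (mm)
        evap.append(float(row[9]))   # Evap column (mm)
    return dates, rain, evap


dates, rain, evap = read_columns(io.StringIO(sample))
print(dates, rain, evap)
```

For the real file, the same helper should accept an open file object in place of the `io.StringIO` wrapper, for example the object returned by `open('Data_Bris.txt', newline='')`, since iterating a file also yields one line at a time. The values are appended directly rather than wrapped in single-element lists, so each result is a flat list that can be passed to `np.array` as is.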
Regarding "python - importing a space-delimited .csv into python3, ignoring the text at the start?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47826517/