我正在从文本文件读取数据并将其存储在 MySql 数据库中。文本文件中的数据格式为
"sd","modu","datfil","244","de3000.Std.27","CPU:KC.EBM1_16.02.18.cd","13","ffm"
"TIMESTAMP","RNO","tem","t_vel","tem_acc","velc","sd","sd_ds_as"
"","","mp","mp","mp","mp","mp","mp"
"2009-02-25 14:28:36.76",missing RNO,8.277527,0.68,0.15,0.42,762.0303,4.6801
"2009-02-25 14:28:36.8",missing RNO,8.24408,0.7,0.03,0.3,761.878,4.682412
"2009-02-25 14:29:36.88",2,8.277527,0.55,0.09,0.31,762.0018,4.680709
"2009-02-25 14:30:36.92",3,8.277527,0.47,0.2,0.31,761.8914,4.684526
"2009-02-25 14:48:36.96",4,8.277527,0.46,0.14,0.28,761.9133,4.692356
"2009-02-25 14:58:37",5,8.210632,0.42,0.09,0.35,761.9025,4.696963
"2009-02-25 14:58:37.08",6,8.277527,0.51,0.19,0.27,761.8416,4.69718
"2009-02-25 14:58:37.12",7,8.277527,0.36,0.23,0.33,761.7534,4.701172
"2009-02-25 14:58:37.16",8,8.24408,0.44,0.08,0.5,761.8087,4.700504
问题是时间戳包含在“”中,并且某些数据值包含毫秒。为了解决这个问题,我编写了以下代码
with open(filepath) as f:
lines = f.readlines()
max_lines = len(lines)
for k, line in enumerate(lines):
if k >= (int(skip_header_line) + int(index_line_number)): # skipping headerlines
data_tmp = line.split(',')
strDate = data_tmp[0].replace("\"", "") # 2016-02-25 14:48:36.76
strDate = strDate.split('.')[0] # 2016-02-25 14:48:36
timestamp = datetime.datetime.strptime(strDate, '%Y-%m-%d %H:%M:%S') # 2016-02-25 14:48:36
ts = calendar.timegm(timestamp.timetuple()) # 1456411716
data_buffer = [ts] + data_tmp[1:]
for val in data_buffer:
if val == " ":
val = None
data_buffer.append(val)
else:
continue
print data_buffer
cursor.execute(add_data, data_buffer)
cnx.commit()
with open(marker_file, "w") as f:
f.write(" ".join([ str(item[0]), str(data_tmp[0]), str(max_lines), str(k-int(skip_header_line)+1) ]))
cursor.close()
cnx.close()
我收到以下错误
[1456411716, ' ', '8.277527', '0.68', '0.15', '0.42', '762.0303', '4.6801\n']
mysql.connector.errors.DatabaseError: 1265 (01000): Data
truncated for column 'RNO' at row 1
如果有人知道如何处理这种情况。我会非常感激。
最佳答案
我不确定,如果我理解你的问题,你不应该循环从calendar.timegm函数返回的时间戳。
ts = calendar.timegm(timestamp.timetuple()) # 1456411716
data_buffer = []
for val in ts:
稍微修改您的代码:
lines = '''
"2009-02-25 14:28:36.76",0,8.277527,0.68,0.15,0.42,762.0303,4.6801
"2009-02-25 14:28:36.8",1,8.24408,0.7,0.03,0.3,761.878,4.682412
"2009-02-25 14:29:36.88",2,8.277527,0.55,0.09,0.31,762.0018,4.680709
"2009-02-25 14:30:36.92",3,8.277527,0.47,0.2,0.31,761.8914,4.684526
"2009-02-25 14:48:36.96",4,8.277527,0.46,0.14,0.28,761.9133,4.692356
"2009-02-25 14:58:37",5,8.210632,0.42,0.09,0.35,761.9025,4.696963
"2009-02-25 14:58:37.08",6,8.277527,0.51,0.19,0.27,761.8416,4.69718
"2009-02-25 14:58:37.12",7,8.277527,0.36,0.23,0.33,761.7534,4.701172
"2009-02-25 14:58:37.16",8,8.24408,0.44,0.08,0.5,761.8087,4.700504
'''
skip_header_line = 0
index_line_number = 0
if 1:
lines = lines.splitlines()
for k, line in enumerate(lines):
if k <= (int(skip_header_line) + int(index_line_number)):
continue
data_tmp = line.split(',')
strDate = data_tmp[0].replace("\"", "") # 2016-02-25 14:48:36.76
strDate = strDate.split('.')[0] # 2016-02-25 14:48:36
timestamp = datetime.datetime.strptime(strDate, '%Y-%m-%d %H:%M:%S') # 2016-02-25 14:48:36
ts = calendar.timegm(timestamp.timetuple()) # 1456411716
# rebuild list, first element is ts the others from data_tmp (excluding the datetime)
data_buffer = [ts] + data_tmp[1:]
print data_buffer
# here your insert ?
结果如下:
[1235572116, '0', '8.277527', '0.68', '0.15', '0.42', '762.0303', '4.6801']
[1235572116, '1', '8.24408', '0.7', '0.03', '0.3', '761.878', '4.682412']
[1235572176, '2', '8.277527', '0.55', '0.09', '0.31', '762.0018', '4.680709']
[1235572236, '3', '8.277527', '0.47', '0.2', '0.31', '761.8914', '4.684526']
[1235573316, '4', '8.277527', '0.46', '0.14', '0.28', '761.9133', '4.692356']
[1235573917, '5', '8.210632', '0.42', '0.09', '0.35', '761.9025', '4.696963']
[1235573917, '6', '8.277527', '0.51', '0.19', '0.27', '761.8416', '4.69718']
[1235573917, '7', '8.277527', '0.36', '0.23', '0.33', '761.7534', '4.701172']
[1235573917, '8', '8.24408', '0.44', '0.08', '0.5', '761.8087', '4.700504']
关于python - 在 MySql 中存储文本文件数据,时间戳在引号 ""中,并且有一些缺失的数据值,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/40982893/