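I'm trying to import a large CSV file (1 GB) into a MySQL database using csv.DictReader(), but it takes too long. Do you have any suggestions for making the parsing and upload faster?
Here is a sample of my code: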
def process_data(self):
    f = io.TextIOWrapper(self.cleaned_data['data_file'].file, encoding='utf-8-sig')
    reader = csv.DictReader(f)
    for row in reader:
        Csv.objects.create(starttime=datetime.strptime(row["startTime"], '%Y-%m-%d %H:%M:%S'),
                           incidents_id=(row['id']), type=(row['type']),
                           subtype=(row['subtype']), reportdescription=(row['reportDescription']),
                           street=(row['street']),
                           reportby=(row['reportBy']),
                           longitude=Decimal(row['longitude']), latitude=Decimal(row['Latitude']),
                           endtime=datetime.strptime(row["endTime"], '%Y-%m-%d %H:%M:%S'), dataowner_id=1)
Here is my model:
class Csv(models.Model):
    starttime = models.DateTimeField(blank=True, null=True)
    type = models.CharField(max_length=50, blank=True, null=True)
    subtype = models.CharField(max_length=50, blank=True, null=True)
    reportdescription = models.CharField(max_length=255, blank=True, null=True)
    street = models.CharField(max_length=150, blank=True, null=True)
    reportby = models.CharField(max_length=50, blank=True, null=True)
    longitude = models.DecimalField(max_digits=12, decimal_places=8)
    latitude = models.DecimalField(max_digits=12, decimal_places=8)
    endtime = models.DateTimeField(blank=True, null=True)
    incidents_id = models.CharField(max_length=150, blank=True, null=True)
    dataowner = models.ForeignKey('Dataowner', models.DO_NOTHING)

    def __str__(self):
        return self.type

    class Meta:
        managed = False
        db_table = 'csv'
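Best answer
I don't know the details of your model structure, which keys and indexes are in use, or the exact file size, but I suggest looking at the ORM's bulk_create function. Your loop calls create() once per row, which issues one INSERT per row; bulk_create groups many rows into far fewer queries, so it should improve performance.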
https://docs.djangoproject.com/en/3.0/ref/models/querysets/#bulk-create
bulk_create(objs, batch_size=None, ignore_conflicts=False)
This method inserts the provided list of objects into the database in an efficient manner (generally only 1 query, no matter how many objects there are):
>>> Entry.objects.bulk_create([
...     Entry(headline='This is a test'),
...     Entry(headline='This is only a test'),
... ])
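As a rough sketch of how bulk_create might be applied to the code in the question: the rows are read lazily and handed to the insert function in fixed-size batches, so the 1 GB file is never held in memory as one list. The helper names `row_to_kwargs` and `import_in_batches`, the `make_obj` and `bulk_insert` parameters, and the batch size of 1000 are illustrative assumptions, not part of the original answer. The insert function is passed in as an argument so the sketch can be tried without a Django project.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal
from itertools import islice

TIME_FORMAT = '%Y-%m-%d %H:%M:%S'


def row_to_kwargs(row):
    """Map one CSV row to the field values used in the question's create() call."""
    return dict(
        starttime=datetime.strptime(row['startTime'], TIME_FORMAT),
        endtime=datetime.strptime(row['endTime'], TIME_FORMAT),
        incidents_id=row['id'],
        type=row['type'],
        subtype=row['subtype'],
        reportdescription=row['reportDescription'],
        street=row['street'],
        reportby=row['reportBy'],
        longitude=Decimal(row['longitude']),
        latitude=Decimal(row['Latitude']),  # header name as written in the question
        dataowner_id=1,
    )


def import_in_batches(text_file, make_obj, bulk_insert, batch_size=1000):
    """Read the CSV lazily and pass objects to bulk_insert in batches.

    make_obj and bulk_insert are hypothetical parameters; in the Django view
    they would be the Csv model class and Csv.objects.bulk_create.
    Returns the number of rows handed to bulk_insert.
    """
    reader = csv.DictReader(text_file)
    # Generator: objects are built only as each batch is pulled.
    objs = (make_obj(**row_to_kwargs(row)) for row in reader)
    total = 0
    while True:
        batch = list(islice(objs, batch_size))
        if not batch:
            return total
        bulk_insert(batch)
        total += len(batch)
```

Inside process_data this would be called roughly as import_in_batches(f, Csv, Csv.objects.bulk_create), with f being the io.TextIOWrapper from the question. bulk_create also accepts its own batch_size argument, but that only splits the queries; it still needs the full list of objects up front, which is why the batching here is done while reading.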
Source: "I want to parse a large CSV file and upload it into a MySQL database, but it takes a long time (Python/Django)" on Stack Overflow: https://stackoverflow.com/questions/59275659/