php - Doctrine: inserting a lot of data

Tags: php mysql symfony doctrine-orm doctrine

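I am working on an import of 300,000 rows spread across many CSV files.

First, I take each CSV file and import every row into a single table in my database.

Afterwards, I want to parse all the rows and insert them into the correct tables, which are related to that data in some way.

So I tried this: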

    $qb = $this->entityManager->createQueryBuilder();
    $flows = $qb->select('flow')
        ->from('AppBundle:FlowAndata', 'flow')
        ->getQuery()
        ->getResult();

    $countRows = 0;
    foreach ($flows as $row) {
         //some check 
         $entity = new TestTable();
         $entity->setCode($row->getCode());
         //many other fields
         $this->entityManager->persist($entity);
         $this->entityManager->flush();
    }

In this case, the whole process takes about 5 seconds per row!

Now, if I add setMaxResults like this:

    $qb = $this->entityManager->createQueryBuilder();
    $flows = $qb->select('flow')
        ->from('AppBundle:FlowAndata', 'flow')
        ->setMaxResults(100)
        ->getQuery()
        ->getResult();

it takes less than 1 second!

So I thought of fetching all the rows and splitting the work into chunks with setMaxResults, like this:

    $qb = $this->entityManager->createQueryBuilder();
    $flows = $qb->select('flow')
        ->from('AppBundle:FlowAndata', 'flow')
        ->getQuery()
        ->getResult();

    $countFlows = count($flows);
    $numberOfQuery = $countFlows / 100;

    for ($i = 0; $i <= $numberOfQuery; $i++) {
         $this->entityManager->clear();
         $qb = $this->entityManager->createQueryBuilder();
         $flows = $qb->select('flow')
            ->from('AppBundle:FlowAndata', 'flow')
            ->setFirstResult($i * 100)
            ->setMaxResults(100)
            ->getQuery()
            ->getResult();

    }

This way I create many queries, each limited to 100 rows. Is this a good practice, or is there a better way to parse many rows and insert them?
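As a side note on the chunk arithmetic: the loop above runs while `$i <= $countFlows / 100`, so when the row count is an exact multiple of 100 it issues one extra query that returns nothing. A minimal sketch of the offset calculation follows; `chunkOffsets` is a hypothetical helper name, not part of Doctrine.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the setFirstResult() offsets needed
 * to page through $total rows in chunks of $chunkSize.
 *
 * @return int[]
 */
function chunkOffsets(int $total, int $chunkSize): array
{
    if ($total < 0 || $chunkSize < 1) {
        throw new InvalidArgumentException('Need $total >= 0 and $chunkSize >= 1.');
    }

    // Ceiling division: a partial last chunk still needs its own query.
    $pages = intdiv($total + $chunkSize - 1, $chunkSize);

    $offsets = [];
    for ($page = 0; $page < $pages; $page++) {
        $offsets[] = $page * $chunkSize;
    }

    return $offsets;
}

echo implode(', ', chunkOffsets(250, 100)), "\n"; // 0, 100, 200
echo implode(', ', chunkOffsets(300, 100)), "\n"; // 0, 100, 200
```

The total itself could come from a COUNT query rather than from loading every row first, which is what makes the initial `getResult()` call expensive.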

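Best answer

The efficient approach recommended by the official Doctrine documentation takes advantage of the transactional write-behind behavior of an EntityManager.

Iterating large results for data processing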

You can use the iterate() method just to iterate over a large result and no UPDATE or DELETE intention. The IterableResult instance returned from $query->iterate() implements the Iterator interface so you can process a large result without memory problems using the following approach. (See example)

Bulk inserts

Bulk inserts in Doctrine are best performed in batches, taking advantage of the transactional write-behind behavior of an EntityManager. [...] You may need to experiment with the batch size to find the size that works best for you. Larger batch sizes mean more prepared statement reuse internally but also mean more work during flush. (See example)
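To make the batching pattern concrete, here is a minimal sketch in plain PHP with no Doctrine dependency. `writeInBatches` is a hypothetical helper, and the two callables merely stand in for `persist()` and for `flush()` followed by `clear()`; the counters exist only to show how often a flush would happen.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: calls $persist for every item, calls $flush once per
 * full batch, and once more at the end if a partial batch is left over.
 */
function writeInBatches(iterable $items, int $batchSize, callable $persist, callable $flush): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $count = 0;
    foreach ($items as $item) {
        $persist($item);
        if ((++$count % $batchSize) === 0) {
            $flush(); // a full batch has been queued
        }
    }

    if (($count % $batchSize) !== 0) {
        $flush(); // remainder that did not fill a whole batch
    }

    return $count;
}

// Demo: counters stand in for the EntityManager calls.
$persisted = 0;
$flushes = 0;
$total = writeInBatches(
    range(1, 250),
    100,
    function ($item) use (&$persisted): void { $persisted++; },
    function () use (&$flushes): void { $flushes++; }
);

printf("%d items, %d flushes\n", $total, $flushes); // 250 items, 3 flushes
```

With 250 items and a batch size of 100, the writes are flushed three times instead of 250 times, which is the difference between this pattern and calling flush() inside the loop for every row.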

A version mixing both techniques (inside an entity repository):

    $q = $this->_em->createQuery('SELECT f FROM AppBundle:FlowAndata f');
    $iterableResult = $q->iterate();

    $i = 0;
    $batchSize = 100;

    foreach ($iterableResult as $row) {
        // do stuff with the data in the row, $row[0] is always the object
        /** @var \AppBundle\Entity\FlowAndata $flow */
        $flow = $row[0];

        //some check
        $entity = new TestTable();
        // read from the entity ($flow), not from the wrapping array ($row)
        $entity->setCode($flow->getCode());
        //many other fields

        $this->_em->persist($entity);

        $i++;
        if (($i % $batchSize) === 0) {
            $this->_em->flush();
            // Detaches all objects from Doctrine!
            $this->_em->clear();
        } else {
            // detach from Doctrine, so that it can be Garbage-Collected immediately
            $this->_em->detach($flow);
        }
    }

    $this->_em->flush(); //Persist objects that did not make up an entire batch
    $this->_em->clear();
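On Doctrine ORM 2.8 or later, `iterate()` is deprecated in favor of `toIterable()`, which yields each entity directly instead of wrapping it in an array. The following is a sketch of the same loop under that assumption, not a tested drop-in replacement: the function name `copyFlowsInBatches` is hypothetical, the `AppBundle\Entity` namespace for `TestTable` is assumed, and only the `code` field is copied.

```php
<?php

declare(strict_types=1);

use AppBundle\Entity\FlowAndata; // namespace taken from the docblock in the answer
use AppBundle\Entity\TestTable;  // assumed namespace, adjust to your project
use Doctrine\ORM\EntityManagerInterface;

/**
 * Hypothetical helper: copies every FlowAndata row into TestTable,
 * flushing and clearing the EntityManager once per batch.
 */
function copyFlowsInBatches(EntityManagerInterface $em, int $batchSize = 100): int
{
    $query = $em->createQuery('SELECT f FROM ' . FlowAndata::class . ' f');

    $i = 0;
    // toIterable() hydrates rows one at a time and yields the entity itself.
    foreach ($query->toIterable() as $flow) {
        $entity = new TestTable();
        $entity->setCode($flow->getCode());
        // many other fields

        $em->persist($entity);

        if ((++$i % $batchSize) === 0) {
            $em->flush(); // write the queued inserts
            $em->clear(); // detach everything so memory can be reclaimed
        }
    }

    // Persist objects that did not make up an entire batch.
    $em->flush();
    $em->clear();

    return $i;
}
```

As the documentation quoted above suggests, the batch size is worth experimenting with; 100 is simply the value used in the answer.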

Regarding "php - Doctrine: inserting a lot of data", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45106417/
