PHPExcel 1.8.0 内存在加载、 block 读取和迭代器时耗尽

标签 php excel memory

我已经将我的 php 配置设置设置为上传 12800M 文件、无限大小的文件和无限上传时间以进行测试,但我一直卡在这个常见的 PHPExcel 致命内存耗尽错误上。我收到以下常见错误消息:

fatal error :在第 1220 行的 .../Worksheet.php 中允许的 536870912 字节的内存大小已耗尽(试图分配 32 字节)。

当我使用 block 读取过滤器或迭代器时,随着 .xlsx 文件被进一步读取到行中,内存使用量会增加,而不是像我从 PHPExcel 开发人员那里读到的一些人所报告的那样保持不变。

我使用的是 PHPExcel 1.8.0。我可能会尝试旧版本,因为我读到读取大文件的性能更好。我从定期加载文件并将其读入数组开始,使用示例中的 block 读取过滤,以及此 URL 的迭代器:PHPExcel - memory leak when I go through all rows .我认为它是 1.8.0 或更早的版本并不重要。

<?php

/** Error reporting */
error_reporting(E_ALL);
ini_set('display_errors', TRUE);
ini_set('display_startup_errors', TRUE);
date_default_timezone_set('America/Los_Angeles');

define('EOL',(PHP_SAPI == 'cli') ? PHP_EOL : '<br />');

/** Include PHPExcel_IOFactory */
require_once dirname(__FILE__) . '/../Classes/PHPExcel/IOFactory.php';


if (!file_exists("Test.xlsx")) {
    exit("Please check if Test.xlsx exists first.\n");
}

echo date('H:i:s') , " Load workbook from Excel5 file" , EOL;
$callStartTime = microtime(true);

$objPHPExcel = PHPExcel_IOFactory::load("Order_Short.xls");

$callEndTime = microtime(true);
$callTime = $callEndTime - $callStartTime;

echo 'Call time to load Workbook was ' , sprintf('%.4f',$callTime) , " seconds" , EOL;
// Echo memory usage
echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;


//http://runnable.com/Uot2A2l8VxsUAAAR/read-a-simple-2007-xlsx-excel-file-for-php

//  Read your Excel workbook
$inputFileName="Drybar Client Data for Tableau Test.xlsx";
$table = "mt_company2_project2_table144_raw";

try {
    echo date('H:i:s') , " Load workbook from Excel5 file" , EOL;
    $callStartTime = microtime(true);

    $inputFileType = PHPExcel_IOFactory::identify($inputFileName);
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objPHPExcel = $objReader->load($inputFileName);

    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;

    echo 'Call time to load Workbook was ' , sprintf('%.4f',$callTime) , " seconds\r\n" , EOL;
    // Echo memory usage
    echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;
} catch (Exception $e) {
    die('Error loading file "' . pathinfo($inputFileName, PATHINFO_BASENAME)
        . '": ' . $e->getMessage());
}

// Get worksheet dimensions
$sheet = $objPHPExcel->getSheet(0);

// http://asantillan.com/php-excel-import-to-mysql-using-phpexcel/
$worksheetTitle = $sheet->getTitle();

$highestRow = $sheet->getHighestRow();
$highestColumn = $sheet->getHighestColumn();
$highestColumnIndex = PHPExcel_Cell::columnIndexFromString($highestColumn);

// Calculationg Columns
$nrColumns = ord($highestColumn) - 64;

echo "File ".$worksheetTitle." has ";
echo $nrColumns . ' columns';
echo ' x ' . $highestRow . ' rows.<br />';

//  Loop through each row of the worksheet in turn
for ($row = 1; $row <= $highestRow; $row++) {
    //  Read a row of data into an array
    $rowData = $sheet->rangeToArray('A' . $row . ':' . $highestColumn . $row,
                                    NULL, FALSE, FALSE);

    var_dump($rowData);

    foreach ($rowData[0] as $k => $v) {
        echo "Row: " . $row . "- Col: " . ($k + 1) . " = " . $v . "<br />";
    }

}

我包括了从 PHPExcel Reader 示例 #12 修改的 block 读取过滤器,它仍然给我相同的致命内存耗尽,因为它的内存使用量仍在增加,因为它继续向下读取行?

<?php

error_reporting(E_ALL);
set_time_limit(0);

define('EOL',(PHP_SAPI == 'cli') ? PHP_EOL : '<br />');

date_default_timezone_set('America/Los_Angeles');

/**  Set Include path to point at the PHPExcel Classes folder  **/
set_include_path(get_include_path() . PATH_SEPARATOR . '../../../Classes/');

/**  Include PHPExcel_IOFactory  **/
include 'PHPExcel/IOFactory.php';

$inputFileName = 'Test.xlsx';


/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;

    private $_endRow = 0;

    /**  Set the list of rows that we want to read  */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow    = $startRow;
        $this->_endRow      = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}

$inputFileType = PHPExcel_IOFactory::identify($inputFileName);

$callStartTime = microtime(true);

echo 'Loading file ',pathinfo($inputFileName,PATHINFO_BASENAME),' using IOFactory with a defined reader type of ',$inputFileType,'<br />';

// Call time

$callEndTime = microtime(true);
$callTime = $callEndTime - $callStartTime;
echo 'Call time to read Workbook was ' , sprintf('%.4f',$callTime) , " seconds" , EOL;

// Echo memory usage
echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;

/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);

echo '<hr />';


/**  Define how many rows we want to read for each "chunk"  **/
$chunkSize = 100;
/**  Create a new Instance of our Read Filter  **/
$chunkFilter = new chunkReadFilter();

/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/
$objReader->setReadFilter($chunkFilter);
/**  Loop to read our worksheet in "chunk size" blocks  **/
for ($startRow = 2; $startRow <= 26000; $startRow += $chunkSize) {
    echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ',$startRow,' to ',($startRow+$chunkSize-1),'<br />';
    /**  Tell the Read Filter, the limits on which rows we want to read this iteration  **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/
    $objPHPExcel = $objReader->load($inputFileName);

    //  Do some processing here

    $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);

    echo '<br /><br />';

    // Call time

    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;
    echo 'Call time to read Workbook was ' , sprintf('%.4f',$callTime) , " seconds" , EOL;

    // Echo memory usage
    echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;

    // Echo memory peak usage
    echo date('H:i:s') , " Peak memory usage: " , (memory_get_peak_usage(true) / 1024 / 1024) , " MB" , EOL;
    echo '<hr />';
}

?>
<body>
</html>

最佳答案

你这里确实有一个主要问题:

$sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);

这将尝试为工作表的整个大小构建一个数组,无论您是否加载 block ...... toArray() 根据您正在加载的文件使用电子表格的预期大小,而不是您正在加载的过滤后的单元格集

尝试使用 rangeToArray() 仅获取您通过 block 加载的单元格范围

$sheetData = $objPHPExcel->getActiveSheet()
    ->rangeToArray(
        'A'.$startRow.':'.$objPHPExcel->getActiveSheet()->getHighestColumn().($startRow+$chunkSize-1),
        null,
        true,
        true,
        true
    );

即便如此,在内存中构建 PHP 数组也会占用大量内存;如果代码可以一次处理一行工作表数据而不是填充一个大数组,那么它对内存的需求就会少很多

关于PHPExcel 1.8.0 内存在加载、 block 读取和迭代器时耗尽,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/30359130/

相关文章:

java.lang.OutOfMemory错误: Java heap space when reading from a txt file and writing to an xlsx file

memory - 谷歌计算引擎 : increase memory/CPU of the instance

java - 如何解决java中无法实例化类型Workbook?

Excel VBA - 检查工作簿中是否存在特定工作表和命名单元格的函数

php - 跟踪最后访问的页面

php - 为 Ajax 页面添加书签

java - 如何在不使用 Apache POI 解析大型 excel 文件的情况下复制整个工作表?

c++ - 如果我想重新分配内存并使用32(或64)字节对齐,我有什么选择?

php - Wordpress/PHP - 做一个简码

php - 授予mysql远程访问权限