
PHPExcel runs out of 256, 512 and 1024MB of RAM

Reposted. Author: IT王子. Updated: 2023-10-28 23:46:05

I don't get it. The XLSX sheet is only about 3MB, yet even 1024MB of RAM is not enough for PHPExcel to load it into memory?

I may be doing something wrong here:

function ReadXlsxTableIntoArray($theFilePath)
{
    require_once('PHPExcel/Classes/PHPExcel.php');
    $inputFileType = 'Excel2007';
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objReader->setReadDataOnly(true);
    $objPHPExcel = $objReader->load($theFilePath);
    $rowIterator = $objPHPExcel->getActiveSheet()->getRowIterator();
    $arrayData = $arrayOriginalColumnNames = $arrayColumnNames = array();
    foreach ($rowIterator as $row) {
        $cellIterator = $row->getCellIterator();
        $cellIterator->setIterateOnlyExistingCells(false); // Loop over all cells, even those that are not set
        if (1 == $row->getRowIndex()) {
            // Header row: collect the original and the normalized column names
            foreach ($cellIterator as $cell) {
                $value = $cell->getCalculatedValue();
                $arrayOriginalColumnNames[] = $value;
                // let's remove the diacritics
                $value = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $value);
                // and white spaces
                $valueExploded = explode(' ', $value);
                $value = '';
                // capitalize the first letter of each word
                foreach ($valueExploded as $word) {
                    $value .= ucfirst($word);
                }
                $arrayColumnNames[] = $value;
            }
            continue;
        } else {
            // Data row: key each cell value by its normalized column name
            $rowIndex = $row->getRowIndex();
            reset($arrayColumnNames);
            foreach ($cellIterator as $cell) {
                $arrayData[$rowIndex][current($arrayColumnNames)] = $cell->getCalculatedValue();
                next($arrayColumnNames);
            }
        }
    }
    return array($arrayOriginalColumnNames, $arrayColumnNames, $arrayData);
}

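The function above reads the data from an Excel sheet into an array.

Any suggestions?

At first I allowed PHP to use 256MB of RAM. That was not enough. I then doubled it, and after that tried 1024MB as well. It still runs out of memory with this error: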

Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688

Fatal error (shutdown): Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688

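Best answer

There is plenty written about PHPExcel's memory usage on the PHPExcel forum, so reading through some of the earlier discussions may give you a few ideas. PHPExcel holds an "in memory" representation of a spreadsheet, and is therefore subject to PHP's memory limits.

The physical size of the file is largely irrelevant... what matters far more is how many cells (rows * columns on each worksheet) it contains.

The "rule of thumb" I have always used is an average of about 1k per cell, so a 5M-cell workbook will need around 5GB of memory. However, there are a number of ways to reduce that requirement. They can be combined, depending on what information you need to access in the workbook and what you want to do with it.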
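To make the rule of thumb concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the per-cell figures (about 1k by default, about 300 bytes with cell caching) are the rough averages quoted in this answer, not measured values.

```php
<?php
// Rough estimate only: bytes needed = rows * columns * average bytes per cell.
function estimateWorkbookMemory($rows, $columns, $bytesPerCell = 1024)
{
    return $rows * $columns * $bytesPerCell;
}

// Format a byte count in GiB, rounded to two decimal places.
function formatGiB($bytes)
{
    return round($bytes / (1024 * 1024 * 1024), 2) . ' GiB';
}

$cells = 5000000; // the hypothetical 5M-cell workbook, treated as 5M rows x 1 column

// About 1k per cell with the default in-memory storage
echo formatGiB(estimateWorkbookMemory($cells, 1)), "\n";
// About 300 bytes per cell, the figure quoted for cell caching
echo formatGiB(estimateWorkbookMemory($cells, 1, 300)), "\n";
```

Comparing the result with PHP's memory_limit before calling load() gives an early warning that a workbook is unlikely to fit.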

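If you have multiple worksheets but don't need to load all of them, you can limit the worksheets that the Reader loads with the setLoadSheetsOnly() method. To load a single named worksheet: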

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #2';
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/
$objReader->setLoadSheetsOnly($sheetname);
/** Load $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);

Or you can specify several worksheets in one call to setLoadSheetsOnly() by passing an array of names:

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example1.xls';
$sheetnames = array('Data Sheet #1','Data Sheet #3');
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/
$objReader->setLoadSheetsOnly($sheetnames);
/** Load $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);

If you only need to access part of a worksheet, you can define a Read Filter to identify just the cells you actually want to load:

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #3';

/** Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
class MyReadFilter implements PHPExcel_Reader_IReadFilter {
    public function readCell($column, $row, $worksheetName = '') {
        // Read rows 1 to 7 and columns A to E only
        if ($row >= 1 && $row <= 7) {
            if (in_array($column,range('A','E'))) {
                return true;
            }
        }
        return false;
    }
}

/** Create an Instance of our Read Filter **/
$filterSubset = new MyReadFilter();
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load
It's more efficient to limit sheet loading in this manner rather than coding it into a Read Filter **/
$objReader->setLoadSheetsOnly($sheetname);
echo 'Loading Sheet using filter';
/** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
$objReader->setReadFilter($filterSubset);
/** Load only the rows and columns that match our filter from $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);

Using read filters, you can also read a workbook in "chunks", so that only a single chunk is resident in memory at any one time:

$inputFileType = 'Excel5'; 
$inputFileName = './sampleData/example2.xls';

/** Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter {
    private $_startRow = 0;
    private $_endRow = 0;

    /** Set the list of rows that we want to read */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow = $startRow;
        $this->_endRow = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        // Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}

/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Define how many rows we want to read for each "chunk" **/
$chunkSize = 20;
/** Create a new Instance of our Read Filter **/
$chunkFilter = new chunkReadFilter();
/** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
$objReader->setReadFilter($chunkFilter);

/** Loop to read our worksheet in "chunk size" blocks **/
/** $startRow is set to 2 initially because we always read the headings in row #1 **/
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
    /** Tell the Read Filter, the limits on which rows we want to read this iteration **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /** Load only the rows that match our filter from $inputFileName to a PHPExcel Object **/
    $objPHPExcel = $objReader->load($inputFileName);
    // Do some processing here

    // Free up some of the memory
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
}
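The loop above always runs up to row 65536 and reloads the file on every pass, even when the sheet is much shorter. If the number of rows is known beforehand, the chunk boundaries can be computed up front. The sketch below is plain PHP: chunkStartRows() is a hypothetical helper, and $totalRows is assumed to come from elsewhere (for example from the reader's listWorksheetInfo() method, if your PHPExcel version provides it).

```php
<?php
/**
 * Return the first row of each chunk for a sheet with $totalRows rows.
 * Row 1 is assumed to be the heading row, so data starts at row 2.
 */
function chunkStartRows($totalRows, $chunkSize, $firstDataRow = 2)
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunkSize must be at least 1');
    }
    $starts = array();
    for ($startRow = $firstDataRow; $startRow <= $totalRows; $startRow += $chunkSize) {
        $starts[] = $startRow;
    }
    return $starts;
}

// Example: a sheet with 95 rows read 20 rows at a time
foreach (chunkStartRows(95, 20) as $startRow) {
    // In the chunked loop this is where $chunkFilter->setRows($startRow, 20) would go
    echo 'chunk starts at row ', $startRow, "\n";
}
```

Iterating over these start rows instead of a fixed upper bound avoids loading the file for chunks that contain no data.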

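If you don't need to load formatting information, only the worksheet data, the setReadDataOnly() method tells the Reader to load cell values only, ignoring any cell formatting: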

$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader that we only want to load cell data, not formatting **/
$objReader->setReadDataOnly(true);
/** Load $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);

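Use cell caching. This is a method for reducing the PHP memory required for each cell, at a cost in speed. It works by storing the cell objects in a compressed format, or outside of PHP's memory (e.g. on disk, in APC, in memcache)... but the more memory you save, the slower your scripts will execute. You can, however, reduce the memory required by each cell to about 300 bytes, so the hypothetical 5M cells would need about 1.4GB of PHP memory.

Cell caching is described in section 4.2.1 of the Developer Documentation.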
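As a rough illustration, cell caching is enabled through PHPExcel_Settings before any workbook is created or loaded. The sketch below uses the php://temp cache method. The 8MB threshold and the file path are placeholder assumptions, and which cache methods are available depends on your PHPExcel version and PHP extensions, so check the Developer Documentation for your release.

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

// The cache method must be set before a PHPExcel object is instantiated or loaded.
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
// Illustrative value: keep up to 8MB of cell data in memory, spill the rest to php://temp
$cacheSettings = array('memoryCacheSize' => '8MB');

if (!PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings)) {
    die('Cell caching method ' . $cacheMethod . ' is not available' . PHP_EOL);
}

$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadDataOnly(true);
// Hypothetical path, for illustration only
$objPHPExcel = $objReader->load('./sampleData/example1.xlsx');
```

Cell caching combines with the other techniques above, such as setReadDataOnly() and read filters.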

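Edit

Looking at your code, you are using the iterators, which aren't particularly efficient, and building up an array of cell data. You might want to look at the toArray() method, which is already built into PHPExcel and does this for you. Also take a look at this recent discussion about the new variant method rangeToArray(), which builds an associative array of row data.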
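Since toArray() returns a plain array of rows, the header handling from the question can be done afterwards in ordinary PHP. The following is a minimal sketch under that assumption: normalizeColumnName() and rowsToAssoc() are hypothetical helpers, the sample array stands in for the output of toArray(), and the diacritic removal with iconv() from the question is left out for brevity.

```php
<?php
// Remove spaces and capitalize the first letter of each word, as in the question's code.
function normalizeColumnName($name)
{
    $value = '';
    foreach (explode(' ', trim($name)) as $word) {
        $value .= ucfirst($word);
    }
    return $value;
}

// Turn a list of rows (first row = headings) into rows keyed by normalized column name.
function rowsToAssoc(array $rows)
{
    if (empty($rows)) {
        return array();
    }
    $header = array_map('normalizeColumnName', array_shift($rows));
    $result = array();
    foreach ($rows as $row) {
        // Assumes every row has as many cells as the heading row
        $result[] = array_combine($header, $row);
    }
    return $result;
}

// Hypothetical sample shaped like the result of $objPHPExcel->getActiveSheet()->toArray()
$rows = array(
    array('first name', 'last name'),
    array('Ada', 'Lovelace'),
);
print_r(rowsToAssoc($rows));
```

This keeps the spreadsheet access to a single library call and moves the column-name logic into code that can be tested without PHPExcel.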

Regarding "PHPExcel runs out of 256, 512 and 1024MB of RAM", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4817651/
