gpt4 book ai didi

php - 使用 PHPExcel 处理 .xls 获取 OOM

转载 作者:行者123 更新时间:2023-12-04 20:35:34 25 4
gpt4 key购买 nike

我知道 How to read large worksheets from large Excel files (27MB+) with PHPExcel?我已经尝试实现该问题中讨论的分 block 阅读,但是我仍然遇到 OOM 错误。文件本身不到 5Mb,有 9000 多行(是的,超过 9000 行!),范围从 A 到 V。

我不想让用户在上传和处理它之前对这个文件进行任何编辑,因为目前这都是一个手动过程,我想用一个自动化的过程来完全替换它。该文件为 xls 格式,通过 PHPExcel 识别为 Excel5。

我的 PHP 内存限制当前设置为 128M,在 Ubuntu Server 上运行。

无论我设置什么 block 大小,我最终都会 OOM'ing。对于较大的 block 大小,如果我将 block 大小设置为 200,它实际上运行得更好(因为我可以管理大约 7000 行),当设置为 1 时,它 OOM 在第 370 行左右。所以我相信正在存储“某些东西”,或在 block 读取的每次迭代中加载到内存中,然后不再丢弃,最终导致 OOM,但我看不到发生这种情况的位置。

我是一个非常业余的程序员,这只是我在工作中作为托管服务角色所做的事情,试图让我们的生活更轻松。

这段代码的重点是读取 excel 文件,过滤掉“废话”,然后将其保存为 CSV(现在我只是将其转储到屏幕而不是 CSV)。按照事情的发展速度,我很想通过 php 脚本调用 excel2csv,然后尝试清理 CSV……但是当我可能相当接近解决方案时,这感觉就像放弃了。

<?php

error_reporting(E_ALL);
set_time_limit(0);
date_default_timezone_set('Europe/London');

require_once 'Classes/PHPExcel/IOFactory.php';

class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
private $_startRow = 0;
private $_endRow = 0;
private $_columns = array();

/** Set the list of rows that we want to read */
public function setRows($startRow, $chunkSize, $columns) {
$this->_startRow = $startRow;
$this->_endRow = $startRow + $chunkSize;
$this->_columns = $columns;
}
public function readCell($column, $row, $worksheetName = '') {
// Only read the heading row, and the rows that are configured in $this->_startRow$
if ($row >= $this->_startRow && $row < $this->_endRow) {
if(in_array($column,$this->_columns)) {
return true;
}
}
return false;
}
}
$target_dir = "uploads/";
$file_name = $_POST["file_name"];

$full_path = $target_dir . $file_name;

echo "Processing ". $file_name . '; <br>';

ob_flush();
flush();


/** /** As files maybe large in memory, use a temp file to handle them
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array( 'memoryCacheSize' => '8MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
**/

$inputFileName = $full_path;

echo 'Excel reader started<br/>';

/** First we should get the type of file **/

$filetype = PHPExcel_IOFactory::identify($inputFileName);

echo 'File of type: ' . $filetype . ' found<br/>';

/** Load $inputFileName to a PHPExcel Object - https://github.com/PHPOffice/PHPExcel/blob/develop/$


/** Define how many rows we want to read for each "chunk" **/
$chunkSize = 1;
/** Create a new Instance of our Read Filter **/
$chunkFilter = new chunkReadFilter();

$objReader = PHPExcel_IOFactory::createReader($filetype);

/** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
$objReader->setReadFilter($chunkFilter);
/** Loop to read our worksheet in "chunk size" blocks **/
for ($startRow = 2; $startRow <= 65000; $startRow += $chunkSize) {
$endRow = $startRow+$chunkSize-1;
echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ',$startR$
/** Tell the Read Filter, the limits on which rows we want to read this iteration **/
$chunkFilter->setRows($startRow,$chunkSize,range('A','T'));
/** Load only the rows that match our filter from $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
// Do some processing here
// $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);
$sheetData = $objPHPExcel->getActiveSheet()->rangeToArray("A$startRow:T$endRow");
var_dump($sheetData);
// Clear the variable to not go over memory!
$objPHPExcel->disconnectWorksheets();
unset ($sheetData);
unset ($objPHPExcel);
ob_flush();
flush();

echo '<br /><br />';
}


/** This loads the entire file, crashing with OOM

try {
$objPHPExcel = PHPExcel_IOFactory::load($inputFileName);
echo 'loaded sheet into memory<br>';
} catch(PHPExcel_Reader_Exception $e) {
die('Error loading file: '.$e->getMessage());
}

$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV');

echo 'Saving sheet as CSV<br>';

$objWriter->setSheetIndex(0);
$objWriter->save('./uploads/'.$file_name.'.csv');
echo 'Processed 1 sheet';
ob_flush();
flush();

**/

echo "<body><table>\n\n";


/**
$f = fopen($file_name, "r");
while (($line = fgetcsv($f)) !== false) {
echo "<tr>";
foreach ($line as $cell) {
echo "<td>" . htmlspecialchars($cell) . "</td>";
}
echo "</tr>\n";
}
fclose($f);
**/

echo "\n</table></body></html>";

?>

apache日志中指示的错误是:
[Fri Mar 31 15:35:27.982697 2017] [:error] [pid 1059] [client 10.0.2.2:53866] PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 45056 bytes) in /var/www/html/Classes/PHPExcel/Shared/OLERead.php on line 93, referer: http://localhost:8080/upload.php

最佳答案

unset ($objPHPExcel);

如果您查看 PHPExcel documentation ,由于电子表格、工作表和单元格之间的循环引用,这不会完全取消设置 $objPHPExcel,并且会导致内存泄漏。建议首先断开这些循环引用。
$objPHPExcel->disconnectWorksheets();
unset($objPHPExcel);

仍然会有一些内存泄漏,但它应该允许在 block 之间释放更多内存

关于php - 使用 PHPExcel 处理 .xls 获取 OOM,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43142945/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com