
.net - Exporting a large amount of data to XLSX - OutOfMemoryException

Reposted. Author: 行者123. Updated: 2023-12-04 09:30:40

I am about to export a large amount of data (115,000 rows x 30 columns) in the Excel OpenXML format (xlsx).
I have tried several libraries: DocumentFormat.OpenXML, ClosedXML and NPOI.

With each of them an OutOfMemoryException is thrown, because the in-memory representation of the worksheet makes memory usage grow exponentially.

Even when I close the document file every 1,000 rows (and release the memory), memory usage grows again the next time the file is loaded.

Is there a more efficient way to export data to xlsx without consuming huge amounts of memory?

Best answer

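The OpenXML SDK is the right tool for this job, but you need to be careful to use the SAX (Simple API for XML) approach rather than the DOM approach. From the Wikipedia article on SAX: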

Where the DOM operates on the document as a whole, SAX parsers operate on each piece of the XML document sequentially



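This greatly reduces the amount of memory consumed when handling large Excel files, because each element is written out as it is produced instead of the whole worksheet being held in memory.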

There is a good article on this here - http://polymathprogrammer.com/2012/08/06/how-to-properly-use-openxmlwriter-to-write-large-excel-files/

Adapted from that article, here is an example that writes 115k rows of 30 columns:
// Namespaces this snippet needs (NuGet package DocumentFormat.OpenXml)
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static void LargeExport(string filename)
{
    using (SpreadsheetDocument document = SpreadsheetDocument.Create(filename, SpreadsheetDocumentType.Workbook))
    {
        //this list of attributes will be used when writing a start element
        List<OpenXmlAttribute> attributes;
        OpenXmlWriter writer;

        document.AddWorkbookPart();
        WorksheetPart workSheetPart = document.WorkbookPart.AddNewPart<WorksheetPart>();

        writer = OpenXmlWriter.Create(workSheetPart);
        writer.WriteStartElement(new Worksheet());
        writer.WriteStartElement(new SheetData());

        for (int rowNum = 1; rowNum <= 115000; ++rowNum)
        {
            //create a new list of attributes
            attributes = new List<OpenXmlAttribute>();
            // add the row index attribute to the list
            attributes.Add(new OpenXmlAttribute("r", null, rowNum.ToString()));

            //write the row start element with the row index attribute
            writer.WriteStartElement(new Row(), attributes);

            for (int columnNum = 1; columnNum <= 30; ++columnNum)
            {
                //reset the list of attributes
                attributes = new List<OpenXmlAttribute>();
                // add data type attribute: "str" keeps the text directly in the cell's <v> element
                // (strictly it is the formula-string type; a true inline string is "inlineStr" with <is><t>).
                // For heavily repeated text you might want to look at the shared strings table.
                attributes.Add(new OpenXmlAttribute("t", null, "str"));
                //add the cell reference attribute
                attributes.Add(new OpenXmlAttribute("r", "", string.Format("{0}{1}", GetColumnName(columnNum), rowNum)));

                //write the cell start element with the type and reference attributes
                writer.WriteStartElement(new Cell(), attributes);
                //write the cell value
                writer.WriteElement(new CellValue(string.Format("This is Row {0}, Cell {1}", rowNum, columnNum)));

                // write the end cell element
                writer.WriteEndElement();
            }

            // write the end row element
            writer.WriteEndElement();
        }

        // write the end SheetData element
        writer.WriteEndElement();
        // write the end Worksheet element
        writer.WriteEndElement();
        writer.Close();

        writer = OpenXmlWriter.Create(document.WorkbookPart);
        writer.WriteStartElement(new Workbook());
        writer.WriteStartElement(new Sheets());

        writer.WriteElement(new Sheet()
        {
            Name = "Large Sheet",
            SheetId = 1,
            Id = document.WorkbookPart.GetIdOfPart(workSheetPart)
        });

        // End Sheets
        writer.WriteEndElement();
        // End Workbook
        writer.WriteEndElement();

        writer.Close();

        // No explicit document.Close() is needed: leaving the using block
        // disposes the document, which saves and closes the package.
    }
}

//A simple helper to get the column name from the column index. This is not well tested!
private static string GetColumnName(int columnIndex)
{
    int dividend = columnIndex;
    string columnName = String.Empty;
    int modifier;

    while (dividend > 0)
    {
        // zero-based offset of the current letter (0 = A ... 25 = Z)
        modifier = (dividend - 1) % 26;
        columnName = Convert.ToChar(65 + modifier).ToString() + columnName;
        // integer division moves on to the next, more significant letter
        dividend = (dividend - modifier) / 26;
    }

    return columnName;
}
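
Since the helper above is described as not well tested, here is a minimal sketch of a check for it. It copies the helper into a small console program and compares a few 1-based indexes (chosen here around the Z/AA and ZZ/AAA boundaries) with the column names Excel uses. The class name ColumnNameCheck and the choice of sample indexes are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;

// Hypothetical wrapper class, used only for this check.
public static class ColumnNameCheck
{
    // Copy of the answer's helper: converts a 1-based column index to an Excel column name.
    public static string GetColumnName(int columnIndex)
    {
        int dividend = columnIndex;
        string columnName = String.Empty;

        while (dividend > 0)
        {
            int modifier = (dividend - 1) % 26;
            columnName = Convert.ToChar(65 + modifier).ToString() + columnName;
            dividend = (dividend - modifier) / 26;
        }

        return columnName;
    }

    public static void Main()
    {
        // Sample indexes and the names Excel uses for those columns.
        int[] indexes = { 1, 26, 27, 30, 52, 53, 702, 703 };
        string[] expected = { "A", "Z", "AA", "AD", "AZ", "BA", "ZZ", "AAA" };

        for (int i = 0; i < indexes.Length; i++)
        {
            string actual = GetColumnName(indexes[i]);
            if (actual != expected[i])
            {
                throw new InvalidOperationException(
                    string.Format("Column {0}: expected {1}, got {2}", indexes[i], expected[i], actual));
            }
            Console.WriteLine("{0} -> {1}", indexes[i], actual);
        }
    }
}
```

The export above only needs columns 1 to 30 (A to AD), so the boundary cases are there mainly for anyone reusing the helper on wider sheets.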

Regarding ".net - Exporting a large amount of data to XLSX - OutOfMemoryException", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32690851/
