gpt4 book ai didi

.net - 使用 vb.net 对大型 Excel 文件执行 SQL 查询的最佳方法是什么?

转载 作者:行者123 更新时间:2023-12-02 11:41:00 25 4
gpt4 key购买 nike

设置环境:

我正在使用 vb.net 和 .NET Framework 4 开发 Excel 2010 应用程序级插件。


我的目标:

  1. 让用户输入多个姓名进行搜索
  2. 使用名称列表在大型电子表格(30,000 多行)上执行 SQL 查询
  3. 返回记录集并粘贴到新工作表中

性能是我的首要任务。我想知道利用 .NET 框架实现此目的的最快方法。

在我的代码中使用 ADO 连接对象是可行的,但该过程花费的时间太长(5 - 8 秒)。


这是我在名为 wells 的表上使用的 SQL 查询:

    SELECT * 
FROM wells
WHERE padgroup in

(SELECT padgroup
FROM wells
WHERE name LIKE 'TOMCHUCK 21-30'
OR name LIKE 'FEDERAL 41-25PH')


以下是表格的一部分:

Excel Table


我现在正在使用此代码创建 ADO 连接对象来检索我的结果:

    'Create Recordset Object
rsCon = CreateObject("ADODB.Connection")
rsData = CreateObject("ADODB.Recordset")

rsCon.Open(szConnect)
rsData.Open(mySQLQueryToExecute, rsCon, 0, 1, 1)

'Check to make sure data is received, then copy the data
If Not rsData.EOF Then

TargetRange.Cells(1, 1).CopyFromRecordset(rsData)

Else

MsgBox("No records returned from : " & SourceFile, vbCritical)

End If

'Clean up the Recordset object
rsData.Close()
rsData = Nothing
rsCon.Close()
rsCon = Nothing


据我所知,Excel 电子表格以 Open XML 格式存储,并且 .NET 框架包含对解析 XML 的 native 支持。

经过研究,我发现了一些不同的选择:


有人可以提供有关最佳使用方法的指示吗?我真的很感激。


附加说明:

  • 所有查询都需要能够在没有连接的情况下执行在线数据库
  • 我只需访问电子表格一次即可从行中提取原始数据


现在我只是将电子表格作为项目资源嵌入。

然后,在运行时我创建文件,运行查询,将结果存储在内存中,然后删除文件。

   'Create temp file path in the commonapplicationdata folder
Dim excelsheetpath As StringBuilder

excelsheetpath = New StringBuilder(Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData))

excelsheetpath.Append("\MasterList.xlsm")

'Save resources into temp location in HD
System.IO.File.WriteAllBytes(excelsheetpath.ToString, My.Resources.MasterList)

'Now call the function to use ADO to get records from the MasterList.xlsm file just created
GetData(excelsheetpath.ToString, "Sheet1", "A1:S40000", True, False)

'Store the results in-memory and display by adding to a datagridview control (in a custom task pane)

'Delete the spreadsheet
System.IO.File.Delete(excelsheetpath.ToString())

最佳答案

您使用 VSTO 的方式是错误的;)不要在 Excel 中使用 SQL。如果您需要速度,请利用 VSTO 和 native Excel API。您可以跳过 ADODB/OLEDB 层的开销,直接进入 Excel 对象模型,使用 Excel 中超快的自动筛选功能,即 SpecialCells 方法,仅将可见单元格放入多区域范围中,以及用于快速将范围复制到数组的 Value 方法。

这是一个示例 VSTO 2010 自定义工作簿,可快速搜索 a list of 58k words for words包含“aba”、“cat”或“zon”。

using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Xml.Linq;
using Microsoft.Office.Tools.Excel;
using Microsoft.VisualStudio.Tools.Applications.Runtime;
using Excel = Microsoft.Office.Interop.Excel;
using Office = Microsoft.Office.Core;

namespace ExcelWorkbook1
{
public partial class ThisWorkbook
{
private void ThisWorkbook_Startup(object sender, System.EventArgs e)
{
const int Sheet1 = 1; // you can use Linq to find a sheet by name if needed
const int ColumnB = 2;
List<List<object>> results = Query(Sheet1, ColumnB, "aba", "cat", "zon");

foreach (List<object> record in results)
{
System.Diagnostics.Debug.Print("{0,-10} {1,30} {2}", record[0], record[1], record[2]);
}
}

private void ThisWorkbook_Shutdown(object sender, System.EventArgs e)
{
}

/// <summary>
/// Removes any existing Excel autofilters from the worksheet
/// </summary>
private void ClearFilter(Microsoft.Office.Interop.Excel._Worksheet worksheet)
{
if (worksheet.AutoFilter != null)
{
worksheet.Cells.AutoFilter();
}
}

/// <summary>
/// Applies an Excel Autofilter to the worksheet for search for an array of substring predicates
/// </summary>
private void ApplyFilter(Microsoft.Office.Interop.Excel._Worksheet worksheet, int column, params string[] predicates)
{
string[] criteria = new string[predicates.Length];
int i = 0;

ClearFilter(worksheet);

foreach (string value in predicates)
{
criteria[i++] = String.Concat("=*", value, "*");
}

worksheet.Cells.AutoFilter(column, criteria, Excel.XlAutoFilterOperator.xlOr);
}

/// <summary>
/// Returns a list of rows that are hits on a search for an array of substrings in Column B of Sheet1
/// </summary>
private List<List<object>> Query(int sheetIndex, int columnIndex, params string[] words)
{
Microsoft.Office.Interop.Excel._Worksheet worksheet;
Excel.Range range;
List<List<object>> records = new List<List<object>>();
List<object> record;
object[,] cells;
object value;
int row, column, rows, columns;
bool hit;

try
{
worksheet = (Microsoft.Office.Interop.Excel._Worksheet)Globals.ThisWorkbook.Sheets[sheetIndex];
if (null == worksheet)
{
return null;
}

// apply the autofilter
ApplyFilter(worksheet, columnIndex, words);

// get the
range = worksheet.Range["$A:$C"].SpecialCells(Excel.XlCellType.xlCellTypeVisible);
foreach (Excel.Range subrange in range.Areas)
{
// copy the cells to a multidimensional array for perfomance
cells = subrange.Value;

// transform the multidimensional array to a List
for (row = cells.GetLowerBound(0), rows = cells.GetUpperBound(0); row <= rows; row++)
{
record = new List<object>();
hit = false;

for (column = cells.GetLowerBound(1), columns = cells.GetUpperBound(1); column <= columns; column++)
{
value = cells[row, column];
hit = hit || (null != value);

if (hit)
{
record.Add(cells[row, column]);
}
}

if (hit)
{
records.Add(record);
}
}
}
}
catch { }
finally
{
// use GC.Collect over Marshal.ReleaseComObject() to release all RCWs per http://stackoverflow.com/a/17131389/1995977 and more
cells = null;
GC.Collect();
GC.WaitForPendingFinalizers();
}

return records;
}

#region VSTO Designer generated code

/// <summary>
/// Required method for Designer support - do not modify
/// the contents of this method with the code editor.
/// </summary>
private void InternalStartup()
{
this.Startup += new System.EventHandler(ThisWorkbook_Startup);
this.Shutdown += new System.EventHandler(ThisWorkbook_Shutdown);
}

#endregion

}
}

关于.net - 使用 vb.net 对大型 Excel 文件执行 SQL 查询的最佳方法是什么?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/22409637/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com