
C# : Forcing a clean run in a long running SQL reader loop?


I have a SQL data reader that reads 2 columns from a SQL DB table. Once it has done its bit, it starts again, selecting another 2 columns.

I would pull the whole lot in one go, but that presents a whole other set of challenges.

My problem is that the table contains a large amount of data (around 3 million rows or so), which makes working with the entire set a bit of a problem.

I'm trying to validate the field values, so I'm pulling the ID column and then one of the other columns, and running each value in that column through a validation pipeline where the results are stored in another database.

My problem is that when the reader hits the end of handling one column, I need to force it to immediately clean up every little block of RAM it used, as this process uses about 700MB and it has about 200 columns to go through.

Without a full garbage collect I will definitely run out of RAM.

Does anyone have any ideas how I can do this?

I'm using lots of little reusable objects. My thought was that I could just call GC.Collect() at the end of each read cycle and that would flush everything out; unfortunately that isn't happening for some reason.
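For reference, the forced full collection described above is normally written as the short sequence below. This is the general .NET pattern for a blocking, exhaustive collection, not code taken from the question:

GC.Collect();                      // first pass: collect everything currently unreachable
GC.WaitForPendingFinalizers();     // let any pending finalizers run to completion
GC.Collect();                      // second pass: collect objects freed by those finalizers

Even so, as the answer below notes, needing this at all usually means something else is still holding references.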

OK, I hope this fits, but here's the method in question ...

void AnalyseTable(string ObjectName, string TableName)
{
    Console.WriteLine("Initialising analysis process for SF object \"" + ObjectName + "\"");
    Console.WriteLine("   The data being used is in table [" + TableName + "]");

    // get some helpful stuff from the databases
    SQLcols = Target.GetData("SELECT Column_Name, Is_Nullable, Data_Type, Character_Maximum_Length FROM information_schema.columns WHERE table_name = '" + TableName + "'");
    SFcols = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "Fields]");
    PickLists = SchemaSource.GetData("SELECT * FROM [" + ObjectName + "PickLists]");

    // get the table definition
    DataTable resultBatch = new DataTable();
    resultBatch.TableName = TableName;
    int counter = 0;

    foreach (DataRow Column in SQLcols.Rows)
    {
        if (Column["Column_Name"].ToString().ToLower() != "id")
            resultBatch.Columns.Add(new DataColumn(Column["Column_Name"].ToString(), typeof(bool)));
        else
            resultBatch.Columns.Add(new DataColumn("ID", typeof(string)));
    }

    // create the validation results table
    //SchemaSource.CreateTable(resultBatch, "ValidationResults_");
    // cache the id's from the source table in the validation table
    //CacheIDColumn(TableName);

    // validate the source table
    // iterate through each sql column
    foreach (DataRow Column in SQLcols.Rows)
    {
        // we do this here to save making this call a lot more later
        string colName = Column["Column_Name"].ToString().ToLower();

        // id col is only used to identify records not in validation
        if (colName != "id")
        {
            // prepare to process
            counter = 0;
            resultBatch.Rows.Clear();
            resultBatch.Columns.Clear();
            resultBatch.Columns.Add(new DataColumn("ID", typeof(string)));
            resultBatch.Columns.Add(new DataColumn(colName, typeof(bool)));

            // identify matching SF col
            foreach (DataRow SFDefinition in SFcols.Rows)
            {
                // case insensitive compare on the col name to ensure we have a match ...
                if (SFDefinition["Name"].ToString().ToLower() == colName)
                {
                    // select the id column and the column data to validate (current column data)
                    using (SqlCommand com = new SqlCommand("SELECT ID, [" + colName + "] FROM [" + TableName + "]", new SqlConnection(ConfigurationManager.ConnectionStrings["AnalysisTarget"].ConnectionString)))
                    {
                        com.Connection.Open();
                        SqlDataReader reader = com.ExecuteReader();

                        Console.WriteLine("   Validating column \"" + colName + "\"");
                        // foreach row in the given object dataset
                        while (reader.Read())
                        {
                            // create a new validation result row
                            DataRow result = resultBatch.NewRow();
                            bool hasFailed = false;

                            // validate it
                            object vResult = ValidateFieldValue(SFDefinition, reader[Column["Column_Name"].ToString()]);
                            // if we have the relevant col definition lets decide how to validate this value ...
                            result[colName] = vResult;

                            if (vResult is bool)
                            {
                                // if it's deemed to have failed validation mark it as such
                                if (!(bool)vResult)
                                    hasFailed = true;
                            }

                            // no point in adding rows we can't trace
                            if (reader["id"] != DBNull.Value && reader["id"] != null)
                            {
                                // add the failed row to the result set
                                if (hasFailed)
                                {
                                    result["id"] = reader["id"];
                                    resultBatch.Rows.Add(result);
                                }
                            }

                            // submit to db in batches of 200
                            if (resultBatch.Rows.Count > 199)
                            {
                                counter += resultBatch.Rows.Count;
                                Console.Write("   Result batch completed,");
                                SchemaSource.Update(resultBatch, "ValidationResults_");
                                Console.WriteLine("   committed " + counter.ToString() + " fails to the database so far.");
                                Console.SetCursorPosition(0, Console.CursorTop - 1);
                                resultBatch.Rows.Clear();
                            }
                        }

                        // get rid of these likely very heavy objects
                        reader.Close();
                        reader.Dispose();
                        com.Connection.Close();
                        com.Dispose();
                        // ensure .Net does a full cleanup because we will need the resources.
                        GC.Collect();

                        // flush any remaining rows for this column
                        if (resultBatch.Rows.Count > 0)
                        {
                            counter += resultBatch.Rows.Count;
                            Console.WriteLine("   All batches for column complete,");
                            SchemaSource.Update(resultBatch, "ValidationResults_");
                            Console.WriteLine("   committed " + counter.ToString() + " fails to the database.");
                        }
                    }
                }
            }
        }

        Console.WriteLine("   Completed processing column \"" + colName + "\"");
        Console.WriteLine("");
    }

    Console.WriteLine("Object processing complete.");
}

Best Answer

Could you post some code? .NET's data reader is supposed to be a "fire-hose" that is stingy with RAM unless, as Freddy suggests, your column data values are large. How long does this validation plus DB write take?

In general, if a GC is needed and it can be done, it will be done. I may sound like a broken record, but if you have to call GC.Collect(), something else is wrong.
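To illustrate the point, here is a minimal sketch of a memory-lean version of the per-column loop: everything disposable sits in a using block and only a small reusable batch is kept alive, so no manual GC.Collect() should be needed. The names connectionString, tableName, colName, Validate and WriteBatch are hypothetical placeholders, not APIs from the question:

using System.Collections.Generic;
using System.Data.SqlClient;

static void ValidateColumn(string connectionString, string tableName, string colName)
{
    // reusable batch: cleared after every flush, so it never grows past 200 entries
    var failures = new List<KeyValuePair<string, bool>>(200);

    using (var con = new SqlConnection(connectionString))
    using (var com = new SqlCommand(
        "SELECT ID, [" + colName + "] FROM [" + tableName + "]", con))
    {
        con.Open();
        using (var reader = com.ExecuteReader())
        {
            while (reader.Read())
            {
                if (reader.IsDBNull(0))
                    continue; // can't trace a row without an ID

                // Validate stands in for the question's validation pipeline
                if (!Validate(reader.GetValue(1)))
                    failures.Add(new KeyValuePair<string, bool>(reader.GetString(0), false));

                if (failures.Count >= 200)
                {
                    WriteBatch(failures); // placeholder: push the batch to the results DB
                    failures.Clear();
                }
            }
        }
    }

    if (failures.Count > 0)
        WriteBatch(failures); // flush the final partial batch
}

Nothing here accumulates across rows except the 200-entry batch, which is exactly the "fire-hose" behaviour the data reader is designed for.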

Regarding "C# : Forcing a clean run in a long running SQL reader loop?", the original question can be found on Stack Overflow: https://stackoverflow.com/questions/2914811/
