
c# - Unexpected memory growth when bulk inserting with Entity Framework

Reposted · Author: 太空狗 · Updated: 2023-10-29 21:44:58

I have to process 1 million entities to build facts. There should be roughly the same number of resulting facts (1 million).

The first problem I ran into was that bulk inserts with Entity Framework are slow, so I used the pattern from Fastest Way of Inserting in Entity Framework (SLauma's answer; the helper is reproduced below). Now I can insert about 100K entities in roughly a minute.

The other problem I ran into was not having enough memory to process everything at once, so I "paged" the processing to avoid the out-of-memory exception I would get if I built a single list of all 1 million resulting facts.

The problem I have is that memory keeps growing with each page, and I don't understand why: no memory is released after each batch. I find this strange, because in every iteration of the loop I fetch the Recon entities, build the facts, and store them to the database. Once an iteration is done, those objects should be released from memory, but they apparently are not, since memory is never freed between iterations.

Before I dig deeper, could you tell me if you can spot anything wrong? More specifically, why is no memory released after each iteration of the while loop?

static void Main(string[] args)
{
    ReceiptsItemCodeAnalysisContext db = new ReceiptsItemCodeAnalysisContext();

    var recon = db.Recons
        .Where(r => r.Transacs.Where(t => t.ItemCodeDetails.Count > 0).Count() > 0)
        .OrderBy(r => r.ReconNum);

    // used for "paging" the processing
    var processed = 0;
    var total = recon.Count();
    var batchSize = 1000; //100000;
    var batch = 1;
    var skip = 0;
    var doBatch = true;

    while (doBatch)
    {
        // list to store facts processed during the batch
        List<ReconFact> facts = new List<ReconFact>();

        // get the Recon items to process in this batch and put them in a list
        List<Recon> toProcess = recon.Skip(skip).Take(batchSize)
            .Include(r => r.Transacs.Select(t => t.ItemCodeDetails))
            .ToList();

        // to process real fast
        Parallel.ForEach(toProcess, r =>
        {
            // processing a recon and adding the facts to the list
            var thisReconFacts = ReconFactGenerator.Generate(r);
            thisReconFacts.ForEach(f => facts.Add(f));
            Console.WriteLine(processed += 1);
        });

        // saving the facts using the pattern provided by Slauma
        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, new System.TimeSpan(0, 15, 0)))
        {
            ReceiptsItemCodeAnalysisContext context = null;
            try
            {
                context = new ReceiptsItemCodeAnalysisContext();
                context.Configuration.AutoDetectChangesEnabled = false;
                int count = 0;

                foreach (var fact in facts.Where(f => f != null))
                {
                    count++;
                    Console.WriteLine(count);
                    context = ContextHelper.AddToContext(context, fact, count, 250, true);
                }
                context.SaveChanges();
            }
            finally
            {
                if (context != null)
                    context.Dispose();
            }
            scope.Complete();
        }
        Console.WriteLine("batch {0} finished, continuing", batch);

        // continuing with the next batch
        batch++;
        skip = batchSize * (batch - 1);
        doBatch = skip < total;
        // AFTER THIS facts AND toProcess SHOULD BE RESET
        // BUT IT LOOKS LIKE THEY ARE NOT OR AT LEAST SOMETHING
        // IS GROWING IN MEMORY
    }
    Console.WriteLine("Processing is done, {0} recons processed", processed);
}
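As an aside, the code above mutates a plain List<ReconFact> from multiple threads inside Parallel.ForEach, which is not thread-safe and can corrupt the list or silently drop facts. A minimal thread-safe sketch of that step, assuming the same ReconFactGenerator API as in the question:

// ConcurrentBag allows lock-free concurrent adds from parallel loop bodies
var factBag = new System.Collections.Concurrent.ConcurrentBag<ReconFact>();
int processedCount = 0;

Parallel.ForEach(toProcess, r =>
{
    foreach (var f in ReconFactGenerator.Generate(r))
        factBag.Add(f);
    // Interlocked.Increment is the thread-safe counterpart of processed += 1
    System.Threading.Interlocked.Increment(ref processedCount);
});

// materialize once all parallel work is done
List<ReconFact> facts = factBag.ToList();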

The method provided by Slauma to optimize bulk inserts with Entity Framework:

class ContextHelper
{
    public static ReceiptsItemCodeAnalysisContext AddToContext(ReceiptsItemCodeAnalysisContext context,
        ReconFact entity, int count, int commitCount, bool recreateContext)
    {
        context.Set<ReconFact>().Add(entity);

        // commit in chunks of commitCount to keep the change tracker small
        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                // recreate the context to drop all entities tracked so far
                context.Dispose();
                context = new ReceiptsItemCodeAnalysisContext();
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }
        return context;
    }
}

Best Answer

Try telling the object context not to track the objects, like this:

static void Main(string[] args)
{
    ReceiptsItemCodeAnalysisContext db = new ReceiptsItemCodeAnalysisContext();

    var recon = db.Recons
        .AsNoTracking() // <---- add this
        .Where(r => r.Transacs.Where(t => t.ItemCodeDetails.Count > 0).Count() > 0)
        .OrderBy(r => r.ReconNum);

    //...

In your code, all one million Recon objects will accumulate in memory until the object context is disposed.
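Following that same reasoning, if AsNoTracking is not an option, another way out is to open a fresh reading context for each batch so that the tracked Recon graphs are released together with it. A minimal sketch, assuming the same entity and context names as in the question:

while (doBatch)
{
    // a fresh context per batch: its change tracker, and the Recon graphs
    // it holds, become collectable as soon as the using block ends
    using (var batchDb = new ReceiptsItemCodeAnalysisContext())
    {
        List<Recon> toProcess = batchDb.Recons
            .Where(r => r.Transacs.Where(t => t.ItemCodeDetails.Count > 0).Count() > 0)
            .OrderBy(r => r.ReconNum)
            .Skip(skip).Take(batchSize)
            .Include(r => r.Transacs.Select(t => t.ItemCodeDetails))
            .ToList();

        // ... process and save the batch as before ...
    }
    batch++;
    skip = batchSize * (batch - 1);
    doBatch = skip < total;
}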

Regarding "c# - Unexpected memory growth when bulk inserting with Entity Framework", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19636317/
