This is more of a conceptual question. I was wondering whether using a lock inside a Parallel.ForEach loop would take away the benefits of parallelizing the loop.
Here is some sample code where I have seen it done.
Parallel.ForEach<KeyValuePair<string, XElement>>(binReferences.KeyValuePairs, reference =>
{
    lock (fileLockObject)
    {
        if (fileLocks.ContainsKey(reference.Key) == false)
        {
            fileLocks.Add(reference.Key, new object());
        }
    }

    RecursiveBinUpdate(reference.Value, testPath, reference.Key, maxRecursionCount, ref recursionCount);

    lock (fileLocks[reference.Key])
    {
        reference.Value.Document.Save(reference.Key);
    }
});
Where fileLockObject and fileLocks are as follows.
private static object fileLockObject = new object();
private static Dictionary<string, object> fileLocks = new Dictionary<string, object>();
Does this technique completely make the loop not parallel?
I would like to see your thoughts on this.
It means all of the work inside the lock can't be done in parallel, so yes, that greatly harms performance here. Since the entire body is not locked (and not all locked on the same object), there is still some parallelization. Whether the parallelism you do get adds enough benefit to outweigh the overhead of managing the threads and synchronizing around the locks is something you really just need to test with your specific data.
That said, it looks like what you're doing (at least in the first locked block, which is the one I'd be more concerned with, as every thread locks on the same object) is serializing access to a Dictionary. You can instead use a ConcurrentDictionary, which is specifically designed to be used from multiple threads and minimizes the amount of synchronization that needs to be done.
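A minimal sketch of that swap, reusing the fileLocks name and per-key locking idea from the question (the PerFileLocks/Demo classes and sample keys are illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class PerFileLocks
{
    // ConcurrentDictionary replaces the lock(fileLockObject) { ContainsKey/Add } block.
    static readonly ConcurrentDictionary<string, object> fileLocks =
        new ConcurrentDictionary<string, object>();

    // GetOrAdd atomically returns the existing lock object for a key,
    // or creates and stores a new one - no explicit lock around the lookup.
    public static object GetLock(string key) =>
        fileLocks.GetOrAdd(key, _ => new object());
}

class Demo
{
    static void Main()
    {
        var keys = new[] { "a.xml", "b.xml", "a.xml" };
        Parallel.ForEach(keys, key =>
        {
            lock (PerFileLocks.GetLock(key))
            {
                // reference.Value.Document.Save(key) would go here.
                Console.WriteLine($"Saving {key}");
            }
        });
    }
}
```

Because GetOrAdd always hands back the same lock object for the same key, saves of the same file are serialized while saves of different files can proceed in parallel.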
if I used a lock ... if that would take away the benefits of parallelizing a foreach loop.
Proportionally. When RecursiveBinUpdate() is a big chunk of work (and independent), it will still pay off. The locking part could be less than 1% of the work, or it could be 99%. Look up Amdahl's law; it applies here.
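Amdahl's law puts a number on that intuition. A small sketch (the serial fractions and core count are illustrative, not measured from the question's code):

```csharp
using System;

class Amdahl
{
    // Amdahl's law: with serial fraction s and n parallel workers,
    // the best possible speedup is 1 / (s + (1 - s) / n).
    public static double Speedup(double serialFraction, int workers) =>
        1.0 / (serialFraction + (1.0 - serialFraction) / workers);

    static void Main()
    {
        // If the locked part is 1% of the work, 8 cores still give ~7.5x.
        Console.WriteLine(Amdahl.Speedup(0.01, 8)); // ~7.48
        // If the locked part is 99% of the work, parallelism barely helps.
        Console.WriteLine(Amdahl.Speedup(0.99, 8)); // ~1.01
    }
}
```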
But worse, your code is not thread-safe. Of your two operations on fileLocks, only the first is actually inside a lock.
lock (fileLockObject)
{
    if (fileLocks.ContainsKey(reference.Key) == false)
    {
        ...
    }
}
and
lock (fileLocks[reference.Key]) // this access to fileLocks[] is not protected
change the 2nd part to:
lock (fileLockObject)
{
    reference.Value.Document.Save(reference.Key);
}
and the use of ref recursionCount as a parameter looks suspicious too. It might work with Interlocked.Increment though.
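A sketch of what that could look like (the Parallel.For loop and iteration count are illustrative; recursionCount stands in for the question's ref parameter):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class CounterDemo
{
    static int recursionCount;

    // A plain recursionCount++ from many threads can lose updates, because the
    // read-increment-write is not atomic; Interlocked.Increment makes it atomic.
    public static int Run(int iterations)
    {
        recursionCount = 0;
        Parallel.For(0, iterations, _ => Interlocked.Increment(ref recursionCount));
        return recursionCount;
    }

    static void Main() => Console.WriteLine(CounterDemo.Run(10_000)); // always 10000
}
```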
When it comes to locks, there's no difference in the way PLINQ/TPL threads have to wait to gain access. So in your case it only makes the loop non-parallel in the regions you're locking, and any work outside those locks will still execute in parallel (i.e. all the work in RecursiveBinUpdate).
Bottom line, I see nothing substantially wrong with what you're doing here.
The "locked" portion of the loop will end up running serially. If the RecursiveBinUpdate function is the bulk of the work, there may be some gain, but it would be better if you could figure out how to handle the lock generation in advance.
There is nothing conceptually wrong with using the lock statement in a Parallel.ForEach loop. If done correctly, the effect of the lock on the overall performance of the parallel operation can be totally negligible. An uncontended lock is really cheap: it could be acquired and released in as little as 20 nanoseconds on a 2010-era computer, in other words it could be acquired and released 50,000,000 times per second (citation). On the other hand, if the lock is used incorrectly, it can completely serialize the body delegate, negating all the benefits of parallelization. Or even worse, as in the code snippet presented in your question, it can be applied incorrectly and allow concurrent access to non-thread-safe state (fileLocks), resulting in undefined behavior.
The general advice is: do the least amount of work while holding the lock, and release it at the earliest opportunity. Don't do work inside the lock that can be done outside the lock.
Regarding the specific problem that the code in the question is attempting to solve, which is to prevent parallel processing of reference objects having the same Key, locking on the Key might not be the optimal strategy. That's because it blocks ThreadPool threads, and also because it may result in a reduced degree of parallelism. Your code does not specify the MaxDegreeOfParallelism, implying unlimited parallelism, which in practice means that the effective degree of parallelism will be determined by ThreadPool availability, which is a bit chaotic¹. In case you change your mind and decide to specify the MaxDegreeOfParallelism, and you also want to maintain a constant degree of parallelism during the whole operation, the simple use of the lock won't cut it. You want the parallelization devoted to doing actual work, not occupied by waits for a lock. You could look at this answer for a more sophisticated approach to this specific problem. That answer is about the asynchronous Parallel.ForEachAsync, but adjusting it to the synchronous Parallel.ForEach should be straightforward.
¹ My personal advice is to always specify the MaxDegreeOfParallelism when using the Parallel APIs.
Your change to the second lock block is overly pessimistic. The intent of the code was that only operations on the same file are synchronized, so that saving two different files can happen at the same time. Saving is a potentially very expensive operation, so it's worth a bit of extra time/effort in managing the synchronization. Yes, the OP's code has a bug in it, and it should be fixed, but it can be solved by simply accessing the dictionary's value inside the global lock instead (or, as my answer suggests, using a ConcurrentDictionary to handle the lookup).
You're probably right that the saving can be pulled out of the lock, I didn't look that deeply. But there is a bug, and as long as it's not fixed, performance isn't going to matter.
@Servy The second lock would contradict your answer, because you said this would lock on the same object.
@diaz994 I fail to see what has contradicted my answer. Henk is proposing not locking on different objects and instead choosing to only lock on a single object.
Thanks for the response, Reed. What do you mean by handling the lock generation in advance? I'm not really following there.
@diaz994 In your case, the initial lock could be avoided by creating the keys before you parallelize. The save could also be done after processing.
But wouldn't that iterate through the references twice? That would add some extra overhead. @Reed Copsey
@diaz994 Iteration is typically quite fast - you might be able to use PLINQ to turn it into a 2-phase select (the first builds the "locks", the second does the processing), then do ForAll to save at the end...
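A rough sketch of that two-phase shape, with hypothetical stand-ins for RecursiveBinUpdate and the Document.Save call (the Run method and sample data are illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

class TwoPhaseDemo
{
    public static List<string> Run(IDictionary<string, string> references)
    {
        // Phase 1 (sequential): create every per-key lock object up front,
        // so the parallel phase only ever reads the dictionary.
        var fileLocks = references.Keys.ToDictionary(k => k, _ => new object());

        var saved = new ConcurrentBag<string>();

        // Phase 2 (parallel): do the heavy work unsynchronized,
        // locking per key only around the save.
        references.AsParallel().ForAll(reference =>
        {
            // RecursiveBinUpdate(reference.Value, ...) would run here.
            lock (fileLocks[reference.Key]) // read-only lookup, safe after phase 1
            {
                saved.Add(reference.Key);   // stands in for Document.Save(reference.Key)
            }
        });

        return saved.OrderBy(k => k).ToList();
    }

    static void Main() =>
        Console.WriteLine(string.Join(", ",
            Run(new Dictionary<string, string> { ["a.xml"] = "A", ["b.xml"] = "B" })));
}
```

Building fileLocks before the parallel phase removes the data race on the dictionary entirely: the parallel phase never mutates it, so the unprotected fileLocks[reference.Key] lookup from the question becomes safe.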