
f# - Why does PSeq.map with a computation expression seem to hang?

Repost. Author: 行者123. Updated: 2023-12-01 13:36:54

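I'm writing a scraper using FSharp.Collections.ParallelSeq and a retry computation expression. I want to retrieve HTML from multiple pages in parallel, and I want each request to be retried when it fails.

For example: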

open System
open FSharp.Collections.ParallelSeq

type RetryBuilder(max) =
    member x.Return(a) = a               // Enable 'return'
    member x.Delay(f) = f                // Gets wrapped body and returns it (as it is)
                                         // so that the body is passed to 'Run'
    member x.Zero() = failwith "Zero"    // Support if .. then
    member x.Run(f) =                    // Gets function created by 'Delay'
        let rec loop(n) =
            if n = 0 then failwith "Failed"  // Number of retries exceeded
            else try f() with _ -> loop(n-1)
        loop max

let retry = RetryBuilder(4)

let getHtml (url : string) = retry {
        Console.WriteLine("Get Url")
        return 0;
    }

//A property/field?
let GetHtmlForAllPages =
    let pages = {1 .. 10}
    let allHtml = pages |> PSeq.map(fun x -> getHtml("http://somesite.com/" + x.ToString())) |> Seq.toArray
    allHtml

[<EntryPoint>]
let main argv =
    let htmlForAllPages = GetHtmlForAllPages
    0 // return an integer exit code

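When I try to use GetHtmlForAllPages from main, the code seems to hang. Stepping through the code shows that PSeq.map starts working on the first four values of pages.

What is happening that prevents the retry computation expression from ever starting or finishing? Is there some strange interaction between PSeq and retry?

If I make GetHtmlForAllPages a function and call it, the code works as expected. I'm curious what happens when GetHtmlForAllPages is a field?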

Best answer

It looks like you are hitting a deadlock inside a static constructor. The scenario has been described as follows:

The CLR uses an internal lock to ensure that static constructor:

  • is only called once
  • gets executed before creation of any instance of the class or before accessing any static members.

With this behaviour of CLR, there is a potential opportunity of a deadlock if we perform any asynchronous blocking operation in a static constructor. (...)

The main thread will wait for the helper thread to complete within the static constructor. Since the helper thread is accessing the instance method, it will first try to acquire the internal lock. As internal lock is already acquired by the main thread, we will end-up in a deadlock situation.

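Using Parallel LINQ (or any similar library, such as FSharp.Collections.ParallelSeq) inside a static constructor will run you into exactly that problem.

Unfortunately, what you get for the GetHtmlForAllPages value is code that runs in the static constructor of a compiler-generated class. From ILSpy (using C# formatting):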

namespace <StartupCode$ConsoleApplication1>
{
    internal static class $Program
    {
        [DebuggerBrowsable(DebuggerBrowsableState.Never)]
        internal static readonly Program.RetryBuilder retry@17;

        [DebuggerBrowsable(DebuggerBrowsableState.Never)]
        internal static readonly int[] GetHtmlForAllPages@24;

        [DebuggerBrowsable(DebuggerBrowsableState.Never), DebuggerNonUserCode, CompilerGenerated]
        internal static int init@;

        static $Program()
        {
            $Program.retry@17 = new Program.RetryBuilder(4);
            IEnumerable<int> pages = Operators.OperatorIntrinsics.RangeInt32(1, 1, 10);
            ParallelQuery<int> parallelQuery = PSeqModule.map<int, int>(new Program.allHtml@26(), pages);
            ParallelQuery<int> parallelQuery2 = parallelQuery;
            int[] allHtml = SeqModule.ToArray<int>((IEnumerable<int>)parallelQuery2);
            $Program.GetHtmlForAllPages@24 = allHtml;
        }
    }
}

And in your actual Program class:

[CompilationMapping(SourceConstructFlags.Value)]
public static int[] GetHtmlForAllPages
{
    get
    {
        return $Program.GetHtmlForAllPages@24;
    }
}

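That is where the deadlock comes from: the PSeq.map call runs inside the static constructor, and its worker threads cannot get past the lock that the main thread holds while it waits for them.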
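To rule out the retry builder as the culprit, it can be exercised on its own with no parallelism involved. This is a minimal sketch: `fetch` and `runFlaky` are hypothetical helpers that are not part of the question, and the Zero member is omitted because this snippet does not need it. `fetch` fails twice before succeeding, so the builder should return on the third attempt.

```fsharp
// Same builder as in the question (without Zero, which is unused here).
type RetryBuilder(max) =
    member x.Return(a) = a
    member x.Delay(f) = f
    member x.Run(f) =
        let rec loop n =
            if n = 0 then failwith "Failed"          // retries exhausted
            else try f () with _ -> loop (n - 1)
        loop max

let retry = RetryBuilder(4)

// Hypothetical stand-in for a flaky request: throws on the first two calls.
let fetch (attempts : int ref) =
    attempts.Value <- attempts.Value + 1
    if attempts.Value < 3 then failwith "transient failure"
    attempts.Value

// Runs the retry expression inside a function and reports (result, attempts).
let runFlaky () =
    let attempts = ref 0
    let result = retry { return fetch attempts }
    result, attempts.Value

let value, calls = runFlaky ()
printfn "result=%d attempts=%d" value calls   // prints "result=3 attempts=3"
```

The builder retries and completes normally here, which is consistent with the hang coming from where the parallel call runs rather than from the computation expression.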

Once you change GetHtmlForAllPages into a function (by adding ()), it is no longer part of that static constructor, and the program works as expected.
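That change can be sketched as follows. The definitions are the ones from the question, and the sketch assumes the FSharp.Collections.ParallelSeq package is referenced by the project, as it is there. The only substantive difference is the `()` after GetHtmlForAllPages, which makes the body run on each call instead of during static initialization.

```fsharp
open System
open FSharp.Collections.ParallelSeq

type RetryBuilder(max) =
    member x.Return(a) = a
    member x.Delay(f) = f
    member x.Zero() = failwith "Zero"
    member x.Run(f) =
        let rec loop n =
            if n = 0 then failwith "Failed"          // retries exhausted
            else try f () with _ -> loop (n - 1)
        loop max

let retry = RetryBuilder(4)

let getHtml (url : string) =
    retry {
        Console.WriteLine("Get Url")
        return 0
    }

// Now a function: nothing parallel runs until it is called.
let GetHtmlForAllPages () =
    seq { 1 .. 10 }
    |> PSeq.map (fun x -> getHtml ("http://somesite.com/" + x.ToString()))
    |> Seq.toArray
```

In main, the call then becomes `let htmlForAllPages = GetHtmlForAllPages ()`. Keeping that call inside main matters, because a top-level `let` bound to the result would put the parallel work back into static initialization.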

Regarding "f# - Why does PSeq.map with a computation expression seem to hang?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43087426/
