
azure-service-fabric - Service Fabric stateful service fails with FailFast

Reposted. Author: 行者123. Updated: 2023-12-04 08:19:59

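I am running a Service Fabric application in an Azure cluster. The application normally runs for days without any problems. Yesterday, however, we noticed that one of its services had gone into the "Error" state, and it has not recovered since.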

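The service is a reliable stateful service with 1 partition and 3 replicas (one primary and two secondaries). It implements a long-running RunAsync() that reads and writes many ReliableDictionary entries concurrently.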

We found the following error in the event log on the primary node:

Description: The application requested process termination through System.Environment.FailFast(string message).
Message: ProgressVectorEntry.Lsn == failureLsn
Stack:
at System.Environment.FailFast(System.String)
at Microsoft.ServiceFabric.Replicator.ProgressVector.FindSharedVector(Microsoft.ServiceFabric.Replicator.ProgressVector, Microsoft.ServiceFabric.Replicator.ProgressVector)
at Microsoft.ServiceFabric.Replicator.ProgressVector.FindCopyModePrivate(Microsoft.ServiceFabric.Replicator.CopyContextParameters, Microsoft.ServiceFabric.Replicator.CopyContextParameters, Int64)
at Microsoft.ServiceFabric.Replicator.ProgressVector.FindCopyMode(Microsoft.ServiceFabric.Replicator.CopyContextParameters, Microsoft.ServiceFabric.Replicator.CopyContextParameters, Int64)
at Microsoft.ServiceFabric.Replicator.LoggingReplicator.GetLogRecordsToCopy(Microsoft.ServiceFabric.Replicator.ProgressVector, System.Fabric.Epoch, Microsoft.ServiceFabric.Replicator.LogicalSequenceNumber, Microsoft.ServiceFabric.Replicator.LogicalSequenceNumber, Int64, Int64, Microsoft.ServiceFabric.Replicator.LogicalSequenceNumber ByRef, Microsoft.ServiceFabric.Replicator.LogicalSequenceNumber ByRef, Microsoft.ServiceFabric.Data.IAsyncEnumerator`1<Microsoft.ServiceFabric.Replicator.LogRecord> ByRef, Microsoft.ServiceFabric.Replicator.BeginCheckpointLogRecord ByRef)
at Microsoft.ServiceFabric.Replicator.LoggingReplicatorCopyStream+<GetNextAsyncSafe>d__3.MoveNext()
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].Start[[Microsoft.ServiceFabric.Replicator.LoggingReplicatorCopyStream+<GetNextAsyncSafe>d__3, Microsoft.ServiceFabric.Data.Impl, Version=5.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<GetNextAsyncSafe>d__3 ByRef)
at Microsoft.ServiceFabric.Replicator.LoggingReplicatorCopyStream.GetNextAsyncSafe(System.Threading.CancellationToken)
at Microsoft.ServiceFabric.Replicator.LoggingReplicatorCopyStream.GetNextAsync(System.Threading.CancellationToken)
at System.Fabric.StateProviderBroker+AsyncEnumerateOperationDataBroker.<BeginGetNext>b__8(System.Threading.CancellationToken)
at System.Fabric.Interop.Utility.WrapNativeAsyncMethodImplementation(System.Func`2<System.Threading.CancellationToken,System.Threading.Tasks.Task>, IFabricAsyncOperationCallback, System.String, System.Fabric.Interop.InteropApi)
at System.Fabric.StateProviderBroker+AsyncEnumerateOperationDataBroker.BeginGetNext(IFabricAsyncOperationCallback)

We have never seen this in our local development environment; so far it has only happened in the Azure cluster.

  1. What is happening here? It looks like corrupted replication information.
  2. What could our code be doing to cause this?

Best answer

This is a bug in Service Fabric itself, and it has been fixed in 5.3.311.
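
To act on the answer, it helps to check whether a cluster's runtime is already at or past 5.3.311. The following is a minimal sketch in Python, not an official tool. It assumes the runtime version is a dotted numeric string such as "major.minor.build" or "major.minor.build.revision", and it leaves open how that string is obtained (for example, by reading it off the cluster's management UI). The function names and the sample version strings are hypothetical.

```python
# Version named in the answer as containing the fix.
FIXED_IN = (5, 3, 311)


def parse_version(text):
    """Turn a dotted numeric version string into a tuple of ints.

    Assumption: every component is a plain integer, e.g. "5.3.311.9590".
    """
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(runtime_version):
    """Return True if runtime_version is at or past the fixed release.

    Tuples compare component by component, so a trailing revision
    number (a fourth component) does not affect the result when the
    first three components already decide the comparison.
    """
    return parse_version(runtime_version) >= FIXED_IN


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    for version in ("5.2.207.9590", "5.3.311.9590", "6.0.211"):
        print(version, "has fix:", has_fix(version))
```

If the check reports that the fix is missing, the answer implies that upgrading the cluster runtime is the remedy; nothing in the service code needs to change.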

Regarding azure-service-fabric - Service Fabric stateful service fails with FailFast, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40376524/
