gpt4 book ai didi

c# - Azure Kubernetes .NET Core 应用程序到 Azure SQL 数据库间歇性错误 258

转载 作者:行者123 更新时间:2023-12-04 12:17:58 26 4
gpt4 key购买 nike

我们正在 Kubernetes 集群中运行 .NET Core 3.1 应用程序。该应用程序使用 EF Core 3.1.7 和 Microsoft.Data.SqlClient 1.1.3 连接到 Azure SQL 数据库。

在看似随机的时间,我们会收到以下错误。

 ---> System.Data.SqlClient.SqlException (0x80131904): Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): Unknown error 258
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParserStateObject.ThrowExceptionAndWarning(Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
at System.Data.SqlClient.SqlDataReader.get_MetaData()
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds)
at System.Data.SqlClient.SqlCommand.ExecuteScalar()

尽管它看起来是随机的,但在较重的负载下它肯定会更频繁地发生。根据我的研究,这个特定超时似乎与连接超时有关,而不是与命令超时有关。 IE。客户端根本无法建立连接。这不是一个超时的查询。

我们已消除的潜在根本原因:

  • Azure SQL Server 容量:无论我们在 4 个还是 16 个 vCPU 上运行,都会观察到该行为。 Azure 支持还确认日志中没有问题。这包括打开的连接数量,该数量仅为 50 个左右。我们还从其他连接运行了负载测试,服务器运行良好。
  • Microsoft.Data.SqlClient 版本:我们一直在版本 1.1.3 上运行,此行为仅在一周前 (2021-03-16) 开始。
  • 网络容量:现阶段我们的最大速度约为 1-2MB/s,这相当缓慢。
  • Kubernetes 扩展:事件的发生与我们何时扩展更多 Pod 之间没有关联。
  • 连接字符串问题:我们的系统过去运行良好,但无论如何,我们更改了其他文章中提到的一些设置,看看问题是否无法自行解决。火星已禁用。我们无法禁用连接池。我们将 TrusServerCertificate 设置为 true。这是当前的连接字符串:Server=tcp:***.database.windows.net,1433;Initial Catalog=***;Persist Security Info=False;User ID=***;Password=** *;MultipleActiveResultSets=False;Encrypt=True;连接超时=60;TrustServerCertificate=True;

更新 1:根据要求,提供了刚刚发生的两次超时的示例。由于是周日,所以客流量非常少。数据库利用率(CPU、Mem、IO)介于 2-6% 之间。

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): Unknown error 258
at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParserStateObject.ThrowExceptionAndWarning(Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
at Microsoft.Data.SqlClient.SqlInternalConnection.BeginSqlTransaction(IsolationLevel iso, String transactionName, Boolean shouldReconnect)
at Microsoft.Data.SqlClient.SqlConnection.BeginTransaction(IsolationLevel iso, String transactionName)
at Microsoft.Data.SqlClient.SqlConnection.BeginDbTransaction(IsolationLevel isolationLevel)
at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.BeginTransaction(IsolationLevel isolationLevel)
at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)

使用此命令时,我们的数据库运行状况检查器也收到错误:Microsoft.EntityFrameworkCore.Infrastruct.DatabaseFacade.CanConnect()

上面的堆栈跟踪是我们试图解决的问题,而下面的堆栈跟踪是 SQL 查询超时的问题。

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): Unknown error 258
at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at Microsoft.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
at Microsoft.Data.SqlClient.SqlDataReader.get_MetaData()
at Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted)
at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method)
at Microsoft.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior)

最佳答案

问题是 Azure 的基础设施问题。 .

There is a known issue within Azure Network where the dhcp lease islost whenever a disk attach/detach happens on some VM fleets. There isa fix rolling out at the moment to regions. I'll check to see whenAzure Status update will be published for this.

问题消失了,因此修复似乎已在全局范围内推出。

对于将来遇到此问题的其他人,您可以通过建立 SSH connection into the node 来识别它。 (不是 Pod)。执行 ls -al/var/log/ 并识别所有 syslog 文件,并对每个文件运行以下 grep。

cat /var/log/syslog | grep 'carrier'

如果日志中存在任何失去运营商获得运营商 消息,则存在某种网络问题。在我们的例子中,它是 DHCP 租约。

enter image description here

关于c# - Azure Kubernetes .NET Core 应用程序到 Azure SQL 数据库间歇性错误 258,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/66721744/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com