A recently deployed and fully functional Azure Function app is deleted and reported absent, but a couple of minutes later it shows up again in the resource group. What causes its resurrection, and how can it be avoided?
Here are the details.
An automated integration test has 3 steps.
- Deployment of ~10 resources in an existing empty Azure resource group. The deployment includes a function app on a Consumption plan.
- Functional test of the system that takes ~10 mins.
- Deletion of all resources in the resource group.
In the last deletion step the script first deletes the function app, checks that the resource is no longer reported in the resource group and then lists and deletes all other resources in parallel. Finally it checks that the resource group is empty.
Surprisingly, the last check often fails!
Even more details.
I delete every resource with the same code, which also checks that the resource is absent after the delete operation:
# Delete the resource, then poll until Get-AzResource no longer returns it.
Remove-AzResource -ResourceId $ResourceId -Force | Out-Null
while (Get-AzResource -ResourceId $ResourceId -ErrorAction SilentlyContinue) {
    Write-Information "Retry: $ResourceId"
    Start-Sleep -Seconds 10
    Remove-AzResource -ResourceId $ResourceId -Force | Out-Null
}
# Emit the ID of the deleted resource.
$ResourceId
And here is a sample output:
[09/03/2023 22:46:24] Deleting ALL resources in resource group *** in ***...
[09/03/2023 22:46:28] Deleting web apps exekiascanaryghw8sync-runs...
/subscriptions/***/resourceGroups/***/providers/Microsoft.Web/sites/exekiascanaryghw8sync-runs
[09/03/2023 22:46:53] Deleting 8 other resources...
/subscriptions/***/resourceGroups/***/providers/Microsoft.Web/serverFarms/WestEuropePlan
/subscriptions/***/resourceGroups/***/providers/Microsoft.Insights/components/exekiascanaryghw8sync
/subscriptions/***/resourceGroups/***/providers/Microsoft.OperationalInsights/workspaces/exekiascanaryghw8sync
/subscriptions/***/resourceGroups/***/providers/Microsoft.Storage/storageAccounts/exekiascanaryghw
/subscriptions/***/resourceGroups/***/providers/Microsoft.Storage/storageAccounts/exekiascanaryghw8sync
/subscriptions/***/resourceGroups/***/providers/Microsoft.EventGrid/systemTopics/exekiascanaryghw
/subscriptions/***/resourceGroups/***/providers/Microsoft.Batch/batchAccounts/exekiascanaryghw8sync
/subscriptions/***/resourceGroups/***/providers/Microsoft.DocumentDB/databaseAccounts/exekiascanaryghw8sync
Name : exekiascanaryghw8sync-runs
ResourceGroupName : ***
ResourceType : Microsoft.Web/sites
Location : westeurope
ResourceId : /subscriptions/***/resourceGroups/***/provid
ers/Microsoft.Web/sites/exekiascanaryghw8sync-runs
Tags :
Name : Default1ys
ResourceGroupName : ***
ResourceType : Microsoft.Web/serverFarms
Location : westeurope
ResourceId : /subscriptions/***/resourceGroups/***/provid
ers/Microsoft.Web/serverFarms/Default1ys
Tags :
Write-Error: D:\a\_temp\11947b94-db88-4a6c-9e1f-6d3ff5b41d40.ps1:2
Line |
2 | ./cleanup_resource_group.ps1 *** exe …
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| [09/03/2023 23:06:28] Resource group is not empty
It shows that all resources have been successfully deleted and reported absent, but ~20 min later the final Get-AzResource -ResourceGroupName *** prints 2 resources. The function app is re-created with the same ResourceId in a new automatic serverFarm with a strange name, Default1ys.
After a while these resurrected resources can be safely deleted, but the question is how to avoid the resurrection in the first place?
More replies
I hope your automated script is not scheduled to run again?
Is the App Service plan deleted along with the function app? The function app may not be getting destroyed because another resource manages it. Check the function app's "managedBy" attribute to see whether another resource is in charge of it. If so, delete that resource first, then delete the function app.
One more option is to use Complete mode in your deployment. Here's a sample script: New-AzResourceGroupDeployment -ResourceGroupName "myResourceGroup" -TemplateFile "C:\Templates\myTemplate.json" -TemplateParameterFile "C:\Templates\myTemplateParameters.json" -Mode Complete
Recommended answers
According to this MS Document, when you delete a resource group, Azure Resource Manager decides which resources to remove first. It uses the following order: all the child (nested) resources are deleted first, resources that manage other resources are deleted next, and the remaining resources are deleted after the previous two categories.
It's possible that another resource is managing the function app in your situation, which is why it isn't being destroyed. By looking at the "managedBy" attribute of the function app, you may determine whether another resource is in charge of managing the function app. If another resource is in charge of it, you should get rid of that resource before getting rid of the function app.
To get around this problem, you can try extending the retry duration or using a different technique to delete the resources. One such technique is deploying in "Complete" mode: Resource Manager then deletes every resource in the resource group that is not defined in the template.
PowerShell code with Complete mode as a parameter:
New-AzResourceGroupDeployment -ResourceGroupName "myResourceGroup" `
    -TemplateFile "C:\Template.json" `
    -TemplateParameterFile "C:\TemplateParameters.json" `
    -Mode Complete
Run the code below to deploy a function app and then delete it:
# Settings for the test deployment.
$resourceGroupName = "siliconrg543"
$functionAppName = "silicon-func58"
$storageAccountName = "valleystrg095"
$location = "Australia East"
$runtime = "dotnet"
$functionVersion = "3"

# Create the resource group, the storage account, and the function app.
New-AzResourceGroup -Name $resourceGroupName -Location $location
New-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName -Location $location -SkuName Standard_LRS
New-AzFunctionApp -ResourceGroupName $resourceGroupName -Name $functionAppName -StorageAccountName $storageAccountName -Runtime $runtime -Location $location -FunctionsVersion $functionVersion

# Delete the function app first, then the whole resource group.
$resourceGroupName = "siliconrg543"
$functionAppName = "silicon-func58"
Remove-AzFunctionApp -ResourceGroupName $resourceGroupName -Name $functionAppName -Force
Remove-AzResourceGroup -Name $resourceGroupName -Force
The function app was deployed successfully.
The resources were removed, and eventually the resource group was also deleted successfully.
References:
Remove-AzFunctionApp (Az.Functions) | Microsoft Learn
New-AzFunctionApp (Az.Functions) | Microsoft Learn
Short answer: a race condition between resource deletion and Azure Policy remediation.
What is happening?
- Deployment of a function app indirectly triggers a policy assigned by Azure administrators. In my case the policy is to disable TLS versions prior to 1.2.
- About 20 minutes after the initial deployment, the policy schedules an incremental deployment to the resource group with the function app ID and the appropriate resource property.
- My script deletes the resource.
- The policy deployment, instead of just setting the property on an existing resource, creates a new function app with the same name and sets the property on it.
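One way to check for this sequence is to list the deployments recorded against the resource group and look for any that the test script did not start. The following is a minimal sketch, assuming the Az.Resources module is installed and the session is already signed in; the function name Get-RecentGroupDeployment and the default of 10 entries are hypothetical choices for illustration.

```powershell
# Hypothetical helper (not from the original post): lists the most recent
# deployments recorded against a resource group, newest first.
# Assumes Az.Resources is installed and Connect-AzAccount has been run.
function Get-RecentGroupDeployment {
    param(
        [Parameter(Mandatory)][string] $ResourceGroupName,
        [int] $Last = 10   # assumed default, adjust as needed
    )

    Get-AzResourceGroupDeployment -ResourceGroupName $ResourceGroupName |
        Sort-Object -Property Timestamp -Descending |
        Select-Object -First $Last -Property DeploymentName, ProvisioningState, Mode, Timestamp
}
```

Calling Get-RecentGroupDeployment -ResourceGroupName with the test resource group after the functional test should show whether an extra incremental deployment appeared that the script itself never issued.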
How to avoid this race condition?
- Modify the initial deployment to comply with all the policies. Then no additional deployments will be scheduled.
- Per SiddheshiDesai's answer, use an empty deployment in Complete mode to clean up the resource group. Not only is the code cleaner, but it also automatically re-deletes the web site that was re-created by the policy deployment.
- (?) List and then monitor policy remediations after the initial deployment, although remediation is not meant to be a concern of resource creators. I have no experience with this.
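The empty Complete-mode deployment from the second option could look roughly like the sketch below. It relies on the documented behavior that Complete mode deletes resources in the group that are not defined in the template, so a template with no resources empties the group without deleting the group itself. The helper names New-EmptyTemplateFile and Clear-ResourceGroup and the temp file name are hypothetical; the Az.Resources module and a signed-in session are assumed.

```powershell
# Hypothetical helper: writes an ARM template with no resources to a temp
# file and returns the file path.
function New-EmptyTemplateFile {
    $template = [ordered]@{
        '$schema'      = 'https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#'
        contentVersion = '1.0.0.0'
        resources      = @()
    }
    $path = Join-Path ([System.IO.Path]::GetTempPath()) 'empty-template.json'
    $template | ConvertTo-Json -Depth 5 | Set-Content -Path $path -Encoding utf8
    return $path
}

# Hypothetical helper: deploys the empty template in Complete mode, which
# removes every resource in the group that the template does not define.
# Assumes Az.Resources is installed and Connect-AzAccount has been run.
function Clear-ResourceGroup {
    param([Parameter(Mandatory)][string] $ResourceGroupName)

    $templateFile = New-EmptyTemplateFile
    # -Force skips the confirmation prompt that Complete mode triggers.
    New-AzResourceGroupDeployment -ResourceGroupName $ResourceGroupName `
        -TemplateFile $templateFile `
        -Mode Complete `
        -Force | Out-Null

    # Return whatever is still reported in the group (ideally nothing).
    Get-AzResource -ResourceGroupName $ResourceGroupName
}
```

In the cleanup step, the script would call Clear-ResourceGroup with the test resource group and fail if anything is returned. Running it a second time after the policy deployment has finished may still be needed, since this sketch does not wait for pending remediations.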
More replies
Thank you for the suggestions. Unfortunately, I cannot apply them in my case. Deleting the resource group is not an option due to security hardening: the automation has no permission to create a new resource group. I didn't find a "managedBy" property on the function app, nor any examples of when or how the property is used. I do use a workaround that retries the deletion after a few minutes, but that doesn't feel like a robust solution. There is still no answer as to why this happens at all.
Actually, I completely overlooked the option of an empty Complete-mode deployment! I will try it and report whether it works more reliably than manual deletion of resources.