azure - Has anyone successfully enabled VM diagnostics with azurerm_virtual_machine_extension?

Repost. Author: 行者123. Updated: 2023-12-03 00:37:01

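Enabling VM diagnostics in Azure is a real pain. I have gotten it working with ARM templates, the Azure PowerShell SDK, and the Azure CLI. But for days now I have been trying to enable VM diagnostics for both Windows and Linux VMs using Terraform and the azurerm_virtual_machine_extension resource. It still does not work, ugh!

Here is what I have so far (I tweaked it a bit to simplify this post, so hopefully my manual edits did not break anything):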

resource "azurerm_virtual_machine_extension" "vm-linux" {
  count                      = "${local.is_windows_vm == "false" ? 1 : 0}"
  depends_on                 = ["azurerm_virtual_machine_data_disk_attachment.vm"]
  name                       = "LinuxDiagnostic"
  location                   = "${var.location}"
  resource_group_name        = "${var.resource_group_name}"
  virtual_machine_name       = "${local.vm_name}"
  publisher                  = "Microsoft.Azure.Diagnostics"
  type                       = "LinuxDiagnostic"
  type_handler_version       = "3.0"
  auto_upgrade_minor_version = "true"

  # The JSON file referenced below was created by running "az vm diagnostics get-default-config", and adding/verifying the "__DIAGNOSTIC_STORAGE_ACCOUNT__" and "__VM_RESOURCE_ID__" placeholders.
  settings = <<SETTINGS
  {
    "ladCfg": "${base64encode(replace(replace(file("${path.module}/.diag-settings/linux_diag_config.json"), "__DIAGNOSTIC_STORAGE_ACCOUNT__", "${module.vm_storage_account.name}"), "__VM_RESOURCE_ID__", "${local.metricsresourceid}"))}",
    "storageAccount": "${module.vm_storage_account.name}"
  }
SETTINGS

  # SAS token below: Do not include the leading question mark, as per https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/diagnostics-linux.
  protected_settings = <<SETTINGS
  {
    "storageAccountName": "${module.vm_storage_account.name}",
    "storageAccountSasToken": "${replace(data.azurerm_storage_account_sas.current.sas, "/^\\?/", "")}",
    "storageAccountEndPoint": "https://core.windows.net/"
  }
SETTINGS
}

resource "azurerm_virtual_machine_extension" "vm-win" {
  count                      = "${local.is_windows_vm == "true" ? 1 : 0}"
  depends_on                 = ["azurerm_virtual_machine_data_disk_attachment.vm"]
  name                       = "Microsoft.Insights.VMDiagnosticsSettings"
  location                   = "${var.location}"
  resource_group_name        = "${var.resource_group_name}"
  virtual_machine_name       = "${local.vm_name}"
  publisher                  = "Microsoft.Azure.Diagnostics"
  type                       = "IaaSDiagnostics"
  type_handler_version       = "1.9"
  auto_upgrade_minor_version = "true"

  # The JSON file referenced below was created by running "az vm diagnostics get-default-config --is-windows-os", and adding/verifying the "__DIAGNOSTIC_STORAGE_ACCOUNT__" and "__VM_RESOURCE_ID__" placeholders.
  settings = <<SETTINGS
  {
    "wadCfg": "${base64encode(replace(replace(file("${path.module}/.diag-settings/windows_diag_config.json"), "__DIAGNOSTIC_STORAGE_ACCOUNT__", "${module.vm_storage_account.name}"), "__VM_RESOURCE_ID__", "${local.metricsresourceid}"))}",
    "storageAccount": "${module.vm_storage_account.name}"
  }
SETTINGS

  protected_settings = <<SETTINGS
  {
    "storageAccountName": "${module.vm_storage_account.name}",
    "storageAccountSasToken": "${data.azurerm_storage_account_sas.current.sas}",
    "storageAccountEndPoint": "https://core.windows.net/"
  }
SETTINGS
}

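Note that for both Linux and Windows I load the diagnostics details from a JSON file in the code base, as described in the comments. These are the default configs provided by Azure, so they should be valid.

When I deploy these, the Linux VM extension deploys successfully, but in the Azure portal the extension reports "Problem detected in generated mdsd configuration". If I look at the VM's "Diagnostic settings", it says "Encountered an error: TypeError: Object doesn't support property or method 'diagnosticMonitorConfiguration'". The Windows VM extension fails to deploy altogether, reporting "Failed to read configuration". If I view the extension in the portal, it shows the following error: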

"code": "ComponentStatus//failed/-3",
"level": "Error",
"displayStatus": "Provisioning failed",
"message": "Error starting the diagnostics extension"

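If I open the "Diagnostic settings" pane, it just hangs on a never-ending "..." animation.

However, if I look at the "terraform apply" output for both VM extensions, the decoded settings look exactly as intended: they match the config files, with the placeholders correctly replaced.

Any suggestions on how to get this working?

Thanks in advance!

Best answer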

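So far I have gotten Windows diagnostics working 100% in our environment. The AzureRM API seems to be very picky about the config it is sent. We had been using PowerShell to enable diagnostics, and the same xmlCfg that works from PowerShell did not work from Terraform. Here is what has worked for us so far. (The settings/protected_settings key names are case sensitive! In other words, xmlCfg works, while xmlcfg does not.)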

main.tf

#########################################################
# VM Extensions - Windows In-Guest Monitoring/Diagnostics
#########################################################
resource "azurerm_virtual_machine_extension" "InGuestDiagnostics" {
  name                       = var.compute["InGuestDiagnostics"]["name"]
  location                   = azurerm_resource_group.VMResourceGroup.location
  resource_group_name        = azurerm_resource_group.VMResourceGroup.name
  virtual_machine_name       = azurerm_virtual_machine.Compute.name
  publisher                  = var.compute["InGuestDiagnostics"]["publisher"]
  type                       = var.compute["InGuestDiagnostics"]["type"]
  type_handler_version       = var.compute["InGuestDiagnostics"]["type_handler_version"]
  auto_upgrade_minor_version = var.compute["InGuestDiagnostics"]["auto_upgrade_minor_version"]

  settings = <<SETTINGS
  {
    "xmlCfg": "${base64encode(templatefile("${path.module}/templates/wadcfgxml.tmpl", { vmid = azurerm_virtual_machine.Compute.id }))}",
    "storageAccount": "${data.azurerm_storage_account.InGuestDiagStorageAccount.name}"
  }
SETTINGS
  protected_settings = <<PROTECTEDSETTINGS
  {
    "storageAccountName": "${data.azurerm_storage_account.InGuestDiagStorageAccount.name}",
    "storageAccountKey": "${data.azurerm_storage_account.InGuestDiagStorageAccount.primary_access_key}",
    "storageAccountEndPoint": "https://core.windows.net"
  }
PROTECTEDSETTINGS
}

tfvar

InGuestDiagnostics = {
  name                       = "WindowsDiagnostics"
  publisher                  = "Microsoft.Azure.Diagnostics"
  type                       = "IaaSDiagnostics"
  type_handler_version       = "1.16"
  auto_upgrade_minor_version = "true"
}

wadcfgxml.tmpl (I removed some of the performance counters for brevity)

<WadCfg>
  <DiagnosticMonitorConfiguration overallQuotaInMB="5120">
    <DiagnosticInfrastructureLogs scheduledTransferLogLevelFilter="Error"/>
    <Metrics resourceId="${vmid}">
      <MetricAggregation scheduledTransferPeriod="PT1H"/>
      <MetricAggregation scheduledTransferPeriod="PT1M"/>
    </Metrics>
    <PerformanceCounters scheduledTransferPeriod="PT1M">
      <PerformanceCounterConfiguration counterSpecifier="\Processor Information(_Total)\% Processor Time" sampleRate="PT60S" unit="Percent" />
      <PerformanceCounterConfiguration counterSpecifier="\Processor Information(_Total)\% Privileged Time" sampleRate="PT60S" unit="Percent" />
      <PerformanceCounterConfiguration counterSpecifier="\Processor Information(_Total)\% User Time" sampleRate="PT60S" unit="Percent" />
      <PerformanceCounterConfiguration counterSpecifier="\Processor Information(_Total)\Processor Frequency" sampleRate="PT60S" unit="Count" />
      <PerformanceCounterConfiguration counterSpecifier="\System\Processes" sampleRate="PT60S" unit="Count" />
      <PerformanceCounterConfiguration counterSpecifier="\SQLServer:SQL Statistics\SQL Re-Compilations/sec" sampleRate="PT60S" unit="Count" />
    </PerformanceCounters>

    <WindowsEventLog scheduledTransferPeriod="PT1M">
      <DataSource name="Application!*[System[(Level = 1 or Level = 2)]]"/>
      <DataSource name="Security!*[System[(Level = 1 or Level = 2)]]"/>
      <DataSource name="System!*[System[(Level = 1 or Level = 2)]]"/>
    </WindowsEventLog>
  </DiagnosticMonitorConfiguration>
</WadCfg>
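
Since the key names are case sensitive and the API is picky about the JSON it receives, one option is to let Terraform build the JSON with jsonencode instead of writing it by hand in a heredoc. This is only a sketch, assuming Terraform 0.12 or later. The three wad_* variables are hypothetical stand-ins for the VM and storage account references used above, and I have not verified it against the extension itself.

```hcl
# Hypothetical inputs standing in for the VM and storage account references above.
variable "wad_vm_id" {
  type = string
}

variable "wad_storage_account_name" {
  type = string
}

variable "wad_storage_account_key" {
  type = string
}

locals {
  # Render the XML template, Base64-encode it, and let jsonencode handle quoting.
  # The key names keep the exact case used in the working heredoc (xmlCfg, not xmlcfg).
  wad_settings = jsonencode({
    xmlCfg = base64encode(templatefile("${path.module}/templates/wadcfgxml.tmpl", {
      vmid = var.wad_vm_id
    }))
    storageAccount = var.wad_storage_account_name
  })

  wad_protected_settings = jsonencode({
    storageAccountName     = var.wad_storage_account_name
    storageAccountKey      = var.wad_storage_account_key
    storageAccountEndPoint = "https://core.windows.net"
  })
}
```

The resource would then use settings = local.wad_settings and protected_settings = local.wad_protected_settings, which should produce the same JSON documents as the heredocs.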

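I finally got Linux guest diagnostics (LAD) working. A few notable facts: unlike Windows diagnostics, the settings have to be passed as plain JSON, without Base64 encoding. In addition, LAD seems to require a SAS token along with the storage account. The usual caveats still apply: the AzureRM API is picky about the config, and the setting names are case sensitive. Here is what has worked for me so far: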

# Locals
locals {
  env = var.workspace[terraform.workspace]
  # Use a set/static time to avoid TF from recreating the SAS token every apply, which would then cause it to
  # modify/recreate anything that uses it. Not ideal, but the token is for a VERY long time, so it will do for now
  sas_begintime = "2019-11-22T00:00:00Z"
  sas_endtime   = timeadd(local.sas_begintime, "873600h")
}

#########################################################
# VM Extensions - In-Guest Diagnostics
#########################################################
# We need a SAS token for the In-Guest Metrics
data "azurerm_storage_account_sas" "inguestdiagnostics" {
  count             = (contains(keys(local.env), "InGuestDiagnostics") ? 1 : 0)
  connection_string = data.azurerm_storage_account.BootDiagStorageAccount.primary_connection_string
  https_only        = true

  resource_types {
    service   = true
    container = true
    object    = true
  }

  services {
    blob  = true
    queue = true
    table = true
    file  = true
  }

  start  = local.sas_begintime
  expiry = local.sas_endtime

  permissions {
    read    = true
    write   = true
    delete  = true
    list    = true
    add     = true
    create  = true
    update  = true
    process = true
  }
}

resource "azurerm_virtual_machine_extension" "inguestdiagnostics" {
  for_each   = contains(keys(local.env), "InGuestDiagnostics") ? local.env["InGuestDiagnostics"] : {}
  depends_on = [azurerm_virtual_machine_extension.dependencyagent]

  name                       = each.value["name"]
  location                   = azurerm_resource_group.resourcegroup.location
  resource_group_name        = azurerm_resource_group.resourcegroup.name
  virtual_machine_name       = azurerm_virtual_machine.compute["${each.key}"].name
  publisher                  = each.value["publisher"]
  type                       = each.value["type"]
  type_handler_version       = each.value["type_handler_version"]
  auto_upgrade_minor_version = each.value["auto_upgrade_minor_version"]

  settings = templatefile("${path.module}/templates/ladcfg2json.tmpl", {
    vmid               = azurerm_virtual_machine.compute["${each.key}"].id
    storageAccountName = data.azurerm_storage_account.BootDiagStorageAccount.name
  })
  protected_settings = <<PROTECTEDSETTINGS
  {
    "storageAccountName": "${data.azurerm_storage_account.BootDiagStorageAccount.name}",
    "storageAccountSasToken": "${replace(data.azurerm_storage_account_sas.inguestdiagnostics.0.sas, "/^\\?/", "")}"
  }
PROTECTEDSETTINGS
}
# These variations didn't work for me ..
# "ladCfg": "${templatefile("${path.module}/templates/ladcfgjson.tmpl", { vmid = azurerm_virtual_machine.compute["${each.key}"].id, storageAccountName = data.azurerm_storage_account.BootDiagStorageAccount.name })}",
# - This one gets you Error: "settings" contains an invalid JSON: invalid character '\n' in string literal or Error: "settings" contains an invalid JSON: invalid character 'S' after object key:value pair

# "ladCfg": "${replace(data.local_file.ladcfgjson["${each.key}"].content, "/\\n/", "")}",
# - This one gets you Error: "settings" contains an invalid JSON: invalid character 'S' after object key:value pair
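
The two failed variations appear to break because a rendered JSON document, with its own quotes and newlines, ends up pasted inside a JSON string literal. One way around that, sketched below, is to parse the file with jsondecode and re-serialize everything with jsonencode. It assumes Terraform 0.12.17 or later (for trimprefix) and that the JSON file holds the ladCfg object itself, with the placeholders described in the question. The lad_* variables are hypothetical stand-ins for the real resource references, and I have not tested this against LAD.

```hcl
# Hypothetical inputs standing in for the VM, storage account, and SAS data source.
variable "lad_vm_id" {
  type = string
}

variable "lad_storage_account_name" {
  type = string
}

variable "lad_sas_token" {
  type = string
}

locals {
  # Substitute the placeholders in the raw text first, as the question does,
  # then parse the result so Terraform holds it as an object rather than a string.
  lad_cfg = jsondecode(replace(
    replace(
      file("${path.module}/.diag-settings/linux_diag_config.json"),
      "__DIAGNOSTIC_STORAGE_ACCOUNT__",
      var.lad_storage_account_name
    ),
    "__VM_RESOURCE_ID__",
    var.lad_vm_id
  ))

  # jsonencode nests the parsed object as real JSON, not as an escaped string.
  lad_settings = jsonencode({
    StorageAccount = var.lad_storage_account_name
    ladCfg         = local.lad_cfg
  })

  # trimprefix removes a leading "?" if there is one and otherwise returns the token unchanged.
  lad_protected_settings = jsonencode({
    storageAccountName     = var.lad_storage_account_name
    storageAccountSasToken = trimprefix(var.lad_sas_token, "?")
  })
}
```

Because the replacement strings passed to replace are not wrapped in slashes, they are treated as literal substrings rather than regular expressions.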

tfvar

workspace = {
  TerraformWorkSpaceName = {
    compute = {
      # Add additional key/objects for additional Compute
      computer01 = {
        name = "computer01"
      }
    }
    InGuestDiagnostics = {
      # Add additional key/objects for each Compute you want to install the InGuestDiagnostics on
      computer01 = {
        name                       = "LinuxDiagnostic"
        publisher                  = "Microsoft.Azure.Diagnostics"
        type                       = "LinuxDiagnostic"
        type_handler_version       = "3.0"
        auto_upgrade_minor_version = "true"
      }
    }
  }
}

I could not get the template file to work without wrapping the whole thing in jsonencode.

ladcfg2json.tmpl

${jsonencode({
  "StorageAccount": "${storageAccountName}",
  "ladCfg": {
    "sampleRateInSeconds": 15,
    "diagnosticMonitorConfiguration": {
      "metrics": {
        "metricAggregation": [
          {
            "scheduledTransferPeriod": "PT1M"
          },
          {
            "scheduledTransferPeriod": "PT1H"
          }
        ],
        "resourceId": "${vmid}"
      },
      "eventVolume": "Medium",
      "performanceCounters": {
        "sinks": "",
        "performanceCounterConfiguration": [
          {
            "counterSpecifier": "/builtin/processor/percentiowaittime",
            "condition": "IsAggregate=TRUE",
            "sampleRate": "PT15S",
            "annotation": [
              {
                "locale": "en-us",
                "displayName": "CPU IO wait time"
              }
            ],
            "unit": "Percent",
            "class": "processor",
            "counter": "percentiowaittime",
            "type": "builtin"
          }
        ]
      },
      "syslogEvents": {
        "syslogEventConfiguration": {
          "LOG_LOCAL0": "LOG_DEBUG"
        }
      }
    }
  }
})}
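
Given how picky the API is, it may help to confirm that the rendered template is valid JSON before the extension is ever deployed. A minimal sketch of such a check, assuming Terraform 0.12 or later: the output name and the two dummy values are hypothetical and exist only so the template can be rendered.

```hcl
# Hypothetical sanity check; the dummy values only exist to render the template.
output "lad_settings_check" {
  # jsondecode fails with an error if the rendered text is not valid JSON,
  # so a broken template surfaces before the extension is deployed.
  value = jsondecode(templatefile("${path.module}/templates/ladcfg2json.tmpl", {
    vmid               = "example-vm-id"
    storageAccountName = "examplestorageaccount"
  }))
}
```

This only checks that the text parses as JSON. It says nothing about whether the extension accepts the keys or values.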

I hope this helps.

Regarding "azure - Has anyone successfully enabled VM diagnostics with azurerm_virtual_machine_extension?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53558919/
