linux - Mounting a data disk with cloud-init fails on an Azure virtual machine

Tags: linux azure ubuntu azure-virtual-machine cloud-init

This is similar to an earlier SO question, from which I adapted my code: How can i use cloud-init to load a datadisk on an ubuntu VM in azure

I am using a cloud-config file passed in through Terraform:

#cloud-config
disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: true
    overwrite: false

fs_setup:
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
  - [
      "/dev/disk/azure/scsi1/lun0-part1",
      "/opt/data",
      auto,
      "defaults,noexec,nofail",
    ]

data "template_file" "cloudconfig" {
  template = file("${path.module}/cloud-init.tpl")
}

data "template_cloudinit_config" "config" {
  gzip          = true
  base64_encode = true

  part {
    content_type = "text/cloud-config"
    content      = "${data.template_file.cloudconfig.rendered}"
  }
}

module "nexus_test_vm" {
  #unnecessary details omitted - 1 VM with 1 external disk, fixed lun of 0, ubuntu 18.04
  vm_size            = "Standard_B2S"

  cloud_init_template = data.template_cloudinit_config.config.rendered
}

The relevant part of the module (VM creation):

resource "azurerm_virtual_machine" "generic-vm" {
  count               = var.number
  name                = "${local.my_name}-${count.index}-vm"
  location            = var.location
  resource_group_name = var.resource_group_name

  network_interface_ids         = [azurerm_network_interface.generic-nic[count.index].id]
  vm_size                       = var.vm_size
  delete_os_disk_on_termination = true

  storage_image_reference {
    id = var.image_id
  }

  storage_os_disk {
    name              = "${local.my_name}-${count.index}-os"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
    disk_size_gb      = var.os_disk_size
  }

  os_profile {
    computer_name  = "${local.my_name}-${count.index}"
    admin_username = local.my_admin_user_name
    custom_data    = var.cloud_init_template
  }

  os_profile_linux_config {
    disable_password_authentication = true

    ssh_keys {
      path = "/home/${local.my_admin_user_name}/.ssh/authorized_keys"
      //key_data = tls_private_key.vm_ssh_key.public_key_openssh
      key_data = var.public_key_openssh
    }
  }

  tags = {
    Name        = "${local.my_name}-${count.index}"
    Deployment  = local.my_deployment
    Prefix      = var.prefix
    Environment = var.env
    Location    = var.location
    Volatile    = var.volatile
    Terraform   = "true"
  }
}

resource "azurerm_managed_disk" "generic-disk" {
  name                 = "${azurerm_virtual_machine.generic-vm.*.name[0]}-1-generic-disk"
  location             = var.rg_location
  resource_group_name  = var.rg_name
  storage_account_type = "Standard_LRS"
  create_option        = "Empty"
  disk_size_gb         = var.external_disk_size
}

resource "azurerm_virtual_machine_data_disk_attachment" "generic-disk" {
  managed_disk_id    = azurerm_managed_disk.generic-disk.id
  virtual_machine_id = azurerm_virtual_machine.generic-vm.*.id[0]
  lun                = 0
  caching            = "ReadWrite"
}

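I get a lot of odd errors indicating that the disk does not exist when cloud-init runs. However, when I ssh into the VM, the disk is there! Is this a race condition? Is there a wait or something similar I can configure in cloud-init to get a better idea of what is happening?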

Relevant logs from the VM:

head -n 5000 /var/log/cloud-init.log | grep lun

2020-04-07 16:30:51,296 - cc_disk_setup.py[DEBUG]: Partitioning disks: {'/dev/disk/azure/scsi1/lun0': {'layout': True, 'overwrite': False, 'table_type': 'gpt'}, '/dev/disk/cloud/azure_resource': {'table_type': 'gpt', 'layout': [100], 'overwrite': True, '_origname': 'ephemeral0'}}
2020-04-07 16:30:51,318 - util.py[DEBUG]: Creating partition on /dev/disk/azure/scsi1/lun0 took 0.021 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,601 - cc_disk_setup.py[DEBUG]: setting up filesystems: [{'device': '/dev/disk/azure/scsi1/lun0', 'filesystem': 'ext4', 'partition': 1}]
2020-04-07 16:30:51,725 - util.py[DEBUG]: Creating fs for /dev/disk/azure/scsi1/lun0 took 0.124 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,733 - cc_mounts.py[DEBUG]: mounts configuration is [['/dev/disk/azure/scsi1/lun0-part1', '/opt/data', 'auto', 'defaults,noexec,nofail']]
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Attempting to determine the real name of /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: changed /dev/disk/azure/scsi1/lun0-part1 => None
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Ignoring nonexistent named mount /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,736 - cc_mounts.py[DEBUG]: Changes to fstab: ['+ /dev/disk/azure/scsi1/lun0-part1 /opt/data auto defaults,noexec,nofail,comment=cloudconfig 0 2']

ls -l /dev/disk/azure/scsi1/lun0

lrwxrwxrwx 1 root root 12 Apr  7 16:32 /dev/disk/azure/scsi1/lun0 -> ../../../sdc

Best answer

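I think the problem is the order in which the data disk, the VM, and cloud-init come up. As far as I know, cloud-init runs when the VM first boots. In your Terraform configuration the managed disk and its attachment both reference the VM, so the data disk is most likely created and attached after the VM, and therefore after cloud-init has already run, which causes the error. The logs are consistent with this: cloud-init ran at 16:30:51, while the /dev/disk/azure/scsi1/lun0 symlink is dated 16:32.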

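So the solution is to define the data disk inside the VM resource with a storage_data_disk block, so that the disk is attached when the VM is created and is already present when cloud-init runs.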
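
A minimal sketch of that change, assuming the module keeps the `azurerm_virtual_machine` resource from the question and that `var.external_disk_size` is available to it. The disk name is a hypothetical placeholder, and the separate `azurerm_managed_disk` and `azurerm_virtual_machine_data_disk_attachment` resources would be removed so the same disk is not defined twice:

```hcl
resource "azurerm_virtual_machine" "generic-vm" {
  count = var.number
  name  = "${local.my_name}-${count.index}-vm"

  # ... all other arguments and blocks stay as in the question ...

  # Data disk declared inline, so Azure attaches it as part of VM creation
  # and it exists at /dev/disk/azure/scsi1/lun0 on first boot.
  storage_data_disk {
    name              = "${local.my_name}-${count.index}-data" # hypothetical name
    lun               = 0                                      # must match the lun used in cloud-config
    caching           = "ReadWrite"
    create_option     = "Empty"
    managed_disk_type = "Standard_LRS"
    disk_size_gb      = var.external_disk_size
  }
}
```

The `lun` value is what ties this block to the `/dev/disk/azure/scsi1/lun0` path in the cloud-config, so the two need to stay in sync if either one changes.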

Regarding "linux - Mounting a data disk with cloud-init fails on an Azure virtual machine", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61085490/
