Tag Archives: VM

Maximum Supported Boot Devices in Virtual Machine BIOS

Noticed a interesting limitation on VMware virtual machines. If you configure multiple SCSI controllers and distribute more than  8 virtual  disks. You may experience randomly OS boot up failure when power cycle VMs. Only last 8 disks with higher SCSI ID present in boot order settings of BIOS. You cannot choose the disks with lower SCSI ID.

You need to following up VMware KB “Changing the boot order of a virtual machine using vmx options (2011654)” to force virtual machines boot up on proper SCSI node.

Advertisements

Report snapshots older than X days in vROps

There are several ways to reporting snapshots. You can use PowerCLI, VRTools, or even vSphere Client itself. Today I will show you how to reporting by vRealize Operations Manager (vROps) 6.x. The benefit of  vROps reports is you can schedule it by sending email with PDF & CSV reports. I found a post discuss how to reports by vROps. But it requests modify policy, it may impacts global calculation. What if your teams request different criteria for reports?

Continue reading

How to Automate Snapshot on Virtual Machine

I always treat virtual machine snapshots like a big risk. It caused several outages in our infrastructure. Please check out Best practices for virtual machine snapshots in the VMware to understand how it impacts production.

虚拟机快照对我来说绝对是个大威胁,已经在我的生产环境里发生过好几次由此引发的故障了。如果你要了解快照对生产环境的影响可以看看:Best practices for virtual machine snapshots in the VMware

Continue reading

Device or Resource Busy

You may read my post How to find which ESXi 5.1 host lock the VM, it’s a solution to figure out which host lock down a file.

But sometimes you may face similar problem but different solution.

You are able to browse the file by CLI or GUI, but cannot delete by either way. It returns you device or resource busy or similar error messages.

You could try following command to delete the file/folders:

rm [File or folder name] -rf

Virtual Machine Disappeared on vCenter Inventory

Just googled this issue, some of people got similar problem, following was my problem and solution, hope it’s useful for you.

Virtual machine maybe lost on vCenter Server inventory for some reason, it’s also disappeared on ESXi inventory when you connect to the individual host directly. You are able to find the VM process by esxcli vm process list, but not able to get it by vim-cmd vmsvc/getallvms.

If you try load the VM manually by commeand vim-cmd vmsvc/reload [vmid], it back you error:

(vim.fault.NotFound) {
dynamicType = <unset>,
faultCause = (vmodl.MethodFault) null,
msg = “Unable to find a VM corresponding to “vmid””,
}

You can also find the VM process by esxtop command.

My solution was remove the ESXi host from vCenter Server, then restart management services, and then browse the VM folder, add the VM back to inventory, then join the host back to vCenter Server.

 

Error 12711 VMM cannot complete the WMI operation on the server because of an error

Finally I implemented Hyper-V 2012 and SCVMM 2012 R2 on my lab, unfortunately FreeNAS does not supports SCSI-3 persistent reservation of Windows Server 2012 R2, you can refer bug #4003. It lead to my iSCSI storage cannot be brought online in Failover Cluster. I have to find out alternative.

I decided to use Windows Server File Server instead of iSCSI eventually. There are bunch of benefit to use that to leverage new SMB 3.0 technology. Key is it supports high available.

Followed the guide I successful created first shares for Hyper-V cluster, I created a testing VM but cannot power it on. It show me:

Error (12711)
VMM cannot complete the WMI operation on the server (dcahyv02.contoso.com) because of an error: [MSCluster_Resource.Name=&quot;SCVMM test (1)&quot;] The cluster resource could not be brought online by the resource monitor.

The cluster resource could not be brought online by the resource monitor (0x139A)

Recommended Action
Resolve the issue and then try the operation again.

I went to cluster service manager on a Hyper-V host, event logs show me:

Cluster resource ‘SCVMM test (1)’ of type ‘Virtual Machine’ in clustered role ‘SCVMM test Resources (1)’ failed. The error code was ‘0x80004005’ (‘Unspecified error’).

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Initially I suspected that’s a problem of new file server, or SCVMM bug. But problem was it cannot be brought up even I created a VM on Hyper-V host directly. It gave me this error:

Virtual machine ‘test’ could not be started because the hypervisor is not running (Virtual machine ID AE786CAA-C74B-4F9E-8867-30191197087B). The following actions may help you resolve the problem: 1) Verify that the processor of the physical computer has a supported version of hardware-assisted virtualization. 2) Verify that hardware-assisted virtualization and hardware-assisted data execution protection are enabled in the BIOS of the physical computer.  (If you edit the BIOS to enable either setting, you must turn off the power to the physical computer and then turn it back on.  Resetting the physical computer is not sufficient.) 3) If you have made changes to the Boot Configuration Data store, review these changes to ensure that the hypervisor is configured to launch automatically.

Hypervisor is not running… it indicates something related to virtualization layer. I finally realized that I was running Hyper-V host on VMware Workstation on a laptop, it’s twice nested VM, something wrong!

I followed my article How to configure nested Hyper-V VM on VMware Workstation to fix this problem.

How to Configure Serial Console for VM by Avocent ACS v6000 Virtual Advanced Console Server

Serials console is very helpful to troubleshooting Linux problem, you can see additional system message via serial console if your Linux server hung. It is essential component on physical server for troubleshooting. It’s challenge to manage serial consoles if your datacenter is very big. You may deploy console server for central management of serial consoles, you don’t have to connect your computer with serial console one by one, you just need connect console server IP follow with port name by telnet protocol.

Time comes to today, virtualization world. How you connect serial console of Linux virtual machine? Can we do exactly same like physical server? Answer is YES! There is couple of way to connect serial console of VM, each way has different benefit. I’m going to introduce the best one!

VMware has a KB article 1022303 introduces how to implement virtual console server, but it’s not very clearly, I went to wrong way by follow up the KB.

Deploy Avocent ACS v6000 virtual advanced console server

1. Download the software image from Emerson website.

2. Install the software on console server VM by follow up ACS v6000 Installer/User Guide.

Configure Linux VM serial console

1. Add a serial port to target Linux VM you want to use serial console.

2. Configure the serial port, Select Use Network option.

3. Select Client (VM initiates connection) option.

4. Input ACSID in Port URI field.

5. Select Use Virtual Serial Console Concentrator option.

6. Input telnet://console server ip:8801 in vSPC URI field.

7. Select Yield CPU on poll option.

8. Make sure Connected and Connect at power on options are selected.
Note: It indicates wrong setting on serial port if Connected option goes back to deselect status automatically after you save the setting.

Enable kernel message on Linux VM

1. Login to your target Linux VM by SSH.

2. Copy following strings to SSH, it will enable kernel message on serial console.
cat <<EOFEOF > /etc/init/serial-ttyS0.conf

# This service maintains a getty on /dev/ttyS0.

start on stopped rc RUNLEVEL=[2345]

stop on starting runlevel [016]

respawn

exec /sbin/agetty /dev/ttyS0 115200 vt100-nav

EOFEOF

3. Run following command.
initctl start serial-ttyS0

Enable serial console on Linux VM

1. Edit grub.conf by following command.
vi /boot/grub/grub.conf

2. Add following lines after hiddenmenu option.
serial –unit=1 –speed=19200

terminal –timeout=8 console serial

3. Add following line in each kernel line.
console=tty0 console=ttyS0,115200

4. Reboot VM.

Configure ASC v6000 console server

1. Login management website of ASC v6000 console server.

2. Go to PortsSerial Ports.

3. Enable ttyS1 device.

4. Go to Access option, you will see the serial console is automatically mapped to serial console of target Linux VM.

Validation

1. Login serial console of target Linux VM via console server by telnet, SSH or serial viewer.

2. Login SSH of target Linux VM directly.

3. On SSH session, run following command to trigger kernel message.
echo h > /proc/sysrq-trigger

4. You will see message on serial console screen.

Unable to load status of objects in vCenter Server 5.1

On today’s troubleshooting, I faced a very weird problem. vCenter Server services were up and running fine, it’s able to connect by vSphere client, but VMs, hosts show gray, and I cannot power on VM via PowerCLI.

After went through each components of vCenter Server, I noticed the database size was 230GB, but only ~20 hosts were there. So I asked DBA team truncat event tables and shrink database. Issue gone after database optimization.

You may want to do same if you face similar issue.

How to Upgrade Virtual Hardware on MSCS VM

We get more new cool feature if keep virtual hardware up to date. And you may face boot problem when upgrade lower virtual hardware version to latest.

I always keep my Microsoft Cluster Services VM (MSCS VM) up to date since RDM disk usually uses on that kind of VMs.

I tried to search how to upgrade virtual hardware on MSCS VM with RDM LUN, but no lucky. That’s my experience:

  1. Update manager doesn’t work for MSCS VM.
  2. No snapshot would be taken if your SCSI controller of RDM is physical mode, you should have a good backup before upgrading.
  3. It’s possible to force upgrade hardware version by right click VM and select Upgrade Virtual Hardware.
  4. Make sure all services are running on another node.
  5. You will get following error message on Event for RDM disks in vSphere Client, upgrading procedure won’t be finished until error pop out for all RDM disks.
  6. I tried upgrade version 7 to 8.

A disk read error occurred after upgrade HW version from 3 to 9

This was a lesson and learns for me after I recovered the data back. My data was lost and no backup…

I had a virtual machine was moved from ESX 3.0 to ESXi 5.1 host long time ago. The virtual disk size show 0 and I cannot do storage migration and snapshot on the VM due to the hardware version was 3, it’s too low.

Generally I take snapshot before upgrade VM HW version, but that’s impossible on a VM of HW version 3 that running on vCenter Server 5.1. So I upgraded the VMware Tools and then VM hardware version by Update Manager. VMware Tools was successfully upgraded, but VM hardware version upgrading got error.

Then I right clicked the VM and used “Upgrade Hardware Version” option directly, it’s successfully without any prompt…finally I got “A disk read error occurred” when boot up. L

You may think it’s caused by SCSI controller since VM hardware version 3 supports IDE virtual disk and version 9 supports only SCSI virtual disk for best performance. That’s not my case. I tried several way to recover the disk, like convert the VM by convertor, mount the disk to other virtual machine, change SCSI parameter…etc.

I don’t think hardware version upgrading changes real virtual disk too much, it must be something changed on the head section of virtual disk, or description file. After consulted with Microsoft we got it fixed finally.

When I mounted the corrupted disk on other virtual machine, partition and size was recognized correctly. And disk manager also can recognize the NTFS file system. I can saw new drive appear in My Computer as well, but it show me “File or directory corrupted…” when I tried to open the drive. It more like a file system issue… it’s easy, just run following command to check any logical errors:

Chkdsk [drive letter]

Wow….a lot of error and files was listed, then I tried command:

Chkdsk /f [drive letter]

That’s real fix logical issue of disk. I could open the drive after used this command.

I mounted the drive back to the broken VM and powered on. New issue came up…Windows show me “Windows NT could not start because the below file is missing or corrupt: C:\Windows\System32\Ntoskrnl.exe”. I replaced the file but no help. The file was existed in the location, and file size was same like other VM, it’s perhaps not file issue?

Then I open VMDK file, aha….ddb.adapterType = “LegacyESX”, changed it to ddb.adapterType = “lsilogic” according to my SCSI controller set, my lovely Windows Server startup screen came back again. J

Okay, I talked too much. To summarize the fixing steps:

  • Mount the broken disk to a good virtual machine with same operating system. ( I’m not sure is it ok to mount on higher version of OS )
  • Run chkdsk [drive letter] to check if logical error existing.
  • Run chkdsk /f [drive] letter] to fix the logical error.
  • Unmounts the disk from good VM.
  • Edit the VMDK file in ESXi console.
  • Change the value of ddb.adapterType to proper SCSI controller type according to your SCSI controller setting.
  • Mount the disk back to broken VM.
  • Power on.

Here is my learning from that contingence:

  1. vCenter Server does not verify compatibility of VM hardware version during upgrading. Actually it’s not allowed to upgrade VM version from 3 to 9 directly.
  2. vCenter Server does not allowed you choose which VM hardware version you want to upgrade to, always latest.
  3. If you upgrade VM version from 3 to 9 directly, a SCSI controller will be added to the VM, value of ddb.adapterType will be changed to LegacyESX. You will not able to boot up the VM due to Windows Server 2003 does not contain proper SCSI driver.
  4. VM version upgrading looks like changes parameters of VMDK file but don’t change too much of real virtual disk, such as NTFS mapping and MBR table…etc.

Last, you may still face BSOD after use above solution since item 3 above, you have to inject the SCSI driver, please refer to KB 1005208 and KB 1006858.

Last of last…. :-) please take a backup of your virtual disk before you do any change!!!!!