It’s been a month. I have been busy making our environment more stable: a lot of troubleshooting, WebEx sessions and discussions. A few days ago I noticed that random VMs kept vMotioning constantly. Some VMs ended up in a strange state, showing an orphaned, invalid or unknown status while still being online.
I couldn’t find any evidence of why the VMs went into these states. One more thing I noticed was that the CPU and memory utilization of the ESXi 5.1 hosts showed 0 in vCenter Server 5.1.
The following is not a firm conclusion; it’s my inference based on the DRS and HA behavior and that particular 0 value for CPU/memory. I also discussed it with VMware BCS support.
The VMs changed to an abnormal status because a vMotion was interrupted by something, most likely HA kicking in after an intermittent network/storage failure. The chance of that happening goes up because DRS keeps trying to move heavy-workload VMs to a host that reports 0 CPU/memory.
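To spot the hosts DRS is likely to favor by mistake, you can look for hosts that report 0 CPU and 0 memory usage while still running VMs. The following is only a minimal sketch of that check: the `HostStats` record and its field names are hypothetical, and the numbers would have to come from whatever inventory export or API you use.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    # Hypothetical record; fill it from your own inventory data.
    name: str
    cpu_usage_mhz: int
    memory_usage_mb: int
    powered_on_vms: int


def suspicious_hosts(hosts):
    """Return names of hosts that report zero CPU and memory while running VMs."""
    return [
        h.name
        for h in hosts
        if h.powered_on_vms > 0
        and h.cpu_usage_mhz == 0
        and h.memory_usage_mb == 0
    ]


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    sample = [
        HostStats("esx01", 0, 0, 12),
        HostStats("esx02", 8400, 65536, 10),
        HostStats("esx03", 0, 0, 0),  # empty host, zero usage is plausible
    ]
    print(suspicious_hosts(sample))
```

A host with no powered-on VMs is deliberately not flagged, since near-zero usage is expected there.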
You have to upgrade to the latest ESXi 5.1 version or vCenter Server 5.1 Update 1c to permanently fix this problem.
Choose one of the following options as a temporary solution; the issue will come back.
1. Restart the ESXi management agents.
2. Disconnect/reconnect the ESXi host in the vSphere Client.
Update: you have to upgrade both the ESXi hosts and vCenter Server to permanently fix the problem.
A few weeks ago I tried to standardize the networking of a cluster. There were 4 VLANs for production virtual machines, so I bound the VLANs to one virtual switch that had 4 physical vmnics.
Then I created 4 port groups with different VLAN IDs, but for some reason the virtual machines were unreachable via some of the vmnics. The network team verified that the port channel was good.
I tried this on several ESXi 5.0 hosts in the cluster and all had the same problem. Finally we found that it’s a Cisco switch bug. You can find detailed information and a workaround here.
If you installed “HP ESXi 5.0 Complete Bundle Update 1.6” via Update Manager 5.0, you will see warnings on the storage and power subsystems of HP servers. That’s because some parameters show NULL in the updated HP SIM provider.
CreationClassName = HPVC_SAController
Name = vmwControllerHPSA1
PowerManagementCapabilities = (NULL)
ResetCapability = (NULL)
OtherDedicatedDescriptions = (NULL)
Dedicated = (NULL)
NameFormat = (NULL)
TransitioningToState = 12
AvailableRequestedStates = (NULL)
TimeOfLastStateChange = (NULL)
EnabledDefault = 2
RequestedState = 12
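If you want to check quickly which properties come back empty on a host, a dump like the one above is easy to scan. This is just a small sketch that assumes the `Name = value` layout shown here and treats the literal `(NULL)` as the empty marker; the function name is mine, not part of any HP or VMware tool.

```python
def null_properties(dump):
    """Return the property names whose value is the literal (NULL) marker."""
    names = []
    for line in dump.splitlines():
        if "=" not in line:
            continue  # skip blank lines or anything that is not a property
        key, _, value = line.partition("=")
        if value.strip() == "(NULL)":
            names.append(key.strip())
    return names


# The property dump from the affected host, as shown above.
SAMPLE = """\
CreationClassName = HPVC_SAController
Name = vmwControllerHPSA1
PowerManagementCapabilities = (NULL)
ResetCapability = (NULL)
OtherDedicatedDescriptions = (NULL)
Dedicated = (NULL)
NameFormat = (NULL)
TransitioningToState = 12
AvailableRequestedStates = (NULL)
TimeOfLastStateChange = (NULL)
EnabledDefault = 2
RequestedState = 12
"""

if __name__ == "__main__":
    for name in null_properties(SAMPLE):
        print(name)
```

For the dump above this lists seven properties, starting with `PowerManagementCapabilities`.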
I think HP has recalled the bundle. You may see error messages similar to the ones below if you already downloaded the patch and then upgraded to Update Manager 5.1.
VMware vSphere Update Manager had an unknown error. Check the events and log files for details.
After upgrading to Update Manager 5.1
Cannot download software packages from patch source. Check the events and the Update Manager log for download details.
After removing the "data" folder in Update Manager 5.1
There is no way to avoid the error message other than filtering your baseline to exclude the HP patches.
Another blogger also described the same situation here.
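The baseline filter itself is set up in Update Manager, but the idea behind it is simple: drop every patch whose vendor is on an exclusion list. Here is a rough sketch of that logic only. The patch records, their `name` and `vendor` fields, and the vendor strings are all hypothetical and do not come from the Update Manager API.

```python
def exclude_vendors(patches, excluded_vendors):
    """Return the patches whose vendor is not in excluded_vendors (case-insensitive)."""
    excluded = {v.lower() for v in excluded_vendors}
    return [p for p in patches if p["vendor"].lower() not in excluded]


if __name__ == "__main__":
    # Hypothetical patch records for illustration only.
    patches = [
        {"name": "example-esxi-patch", "vendor": "VMware"},
        {"name": "example-hp-bundle", "vendor": "HP"},
        {"name": "example-tools-patch", "vendor": "vmware"},
    ]
    for patch in exclude_vendors(patches, ["hp"]):
        print(patch["name"])
```

Check the exact vendor string your patch repository shows before relying on a match like this.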