It’s been a month. I was busy making our environment more stable: a lot of troubleshooting, WebEx sessions and discussions. A few days ago I noticed that random VMs kept getting vMotioned constantly. Some VMs ended up in a strange state, showing orphaned, invalid or unknown status, while still online.
I couldn’t find any evidence of why the VMs went into these states. One more thing I noticed was that CPU and memory utilization of the ESXi 5.1 hosts showed 0 in vCenter Server 5.1.
The following is not a firm conclusion; it’s my inference based on DRS, HA and that particular 0 value for CPU/memory. I also discussed it with VMware BCS support.
The VMs changed to an abnormal status because a vMotion was interrupted by something, most likely HA kicking in after an intermittent network/storage failure. That becomes much more likely because DRS keeps trying to move heavy-workload VMs to the host that reports 0 CPU/memory.
You have to upgrade to the latest version of ESXi 5.1 or to vCenter Server 5.1 Update 1c to permanently fix this problem.
Choose one of the following options. Both are temporary workarounds; the issue will come back.
1. Restart the ESXi management agents.
2. Disconnect and reconnect the ESXi host in the vSphere Client.
Update: you have to upgrade both the ESXi host and vCenter Server to permanently fix the problem.
I have to say you won’t get what you expect if you only follow the VMware documentation. After referring to a few blogs and videos, I finally deployed production in both HA and DR mode. It consumed a lot of time since I had to clone the VM from the US to India over the WAN. It was painful, so I’d like to share it to make sure you never end up in the same situation.
Install vCenter Server and its components on the Primary Server; the Secondary Server will be cloned from it.
vCenter Update Manager, vCenter Converter, ESXi Dump Collector, Syslog Collector are configured using Fully Qualified Domain Names (FQDN) rather than IP addresses.
Time zone and time settings are correct.
Ports 52267 and 57348 are open in the firewall on both servers.
2 GB of free memory is available for vCenter Server Heartbeat.
Administrator rights are required to install vCenter Server Heartbeat.
All vCenter Server components should be functioning before you install vCenter Server Heartbeat.
No * in the SSO master password. (I guess that’s a bug in 5.6U1; please refer to KB2034608 to reset the master password.)
The vCenter Server FQDN is the Primary Server computer name. (It will be changed later.)
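The firewall prerequisite above can be spot-checked from the other server with a small script. This is only a minimal sketch, and it assumes both ports are TCP and that something is already listening on them (for example, once vCHB is installed on the peer). Before that, a failed connection does not prove the firewall is blocking. The function names and the example host name in the usage note are hypothetical.

```python
import socket

# Ports named in the prerequisite list above (assumed here to be TCP).
VCHB_PORTS = (52267, 57348)


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable or unresolvable: treat as closed.
        return False


def check_ports(host: str, ports=VCHB_PORTS, timeout: float = 3.0) -> dict:
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Calling `check_ports("secondary.example.local")` from the Primary Server (and the reverse from the Secondary Server) returns a dictionary keyed by port, so a `False` entry tells you which port to look at first.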
Pre-configuration before installing vCHB:
Make sure the Primary Server computer name is the vCenter Server FQDN.
Change the following vCenter Server services to manual startup on the Primary Server:
VMware VirtualCenter Server
VMware vSphere Profile-Driven Storage
vCenter Inventory Service
VMware VirtualCenter Management Webservices
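If you prefer the command line over the Services console, the startup-type change can be scripted. The sketch below only builds the Windows `sc` commands and prints the lookup step; it does not run anything. It assumes you first resolve each display name to its service key name with `sc getkeyname` and then pass that key name to `sc config`. The helper names are hypothetical, and the key names on your server are not assumed here.

```python
# Display names of the services listed above.
SERVICE_DISPLAY_NAMES = [
    "VMware VirtualCenter Server",
    "VMware vSphere Profile-Driven Storage",
    "vCenter Inventory Service",
    "VMware VirtualCenter Management Webservices",
]


def key_name_command(display_name: str) -> list:
    """Build the sc command that looks up a service key name from its display name."""
    return ["sc", "getkeyname", display_name]


def manual_start_command(key_name: str) -> list:
    """Build the sc command that sets a service to manual (demand) startup.

    sc expects 'start=' and its value as separate tokens.
    """
    return ["sc", "config", key_name, "start=", "demand"]


if __name__ == "__main__":
    # Print the lookup commands; run them in an elevated prompt on the Primary Server.
    for name in SERVICE_DISPLAY_NAMES:
        print(key_name_command(name))
```

Each list can be handed to `subprocess.run` on the server itself, or simply typed by hand; printing first makes it easy to review what would change.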
Recover the encrypted system fingerprint file.
Go to C:\Program Files\VMware\Infrastructure\SSOServer\utils
Recover the fingerprint with the following command: rsautil manage-secrets -a recover -m <SSO master password>
Power off Primary Server
Clone Primary Server to secondary site.
Disconnect vNICs on Secondary Server.
Power on both servers and set IP addresses.
I use two vNICs on each server, one for the Public Network and another for the VMware Channel Network. The Public Network carries two IP addresses, one for the Management Network and another for the Principal Network. The Principal Network address should be the same on both servers if you deploy HA mode; in DR mode the addresses are different.
Disable NetBIOS and DNS registration on each vNIC.
Remove the Secondary Server from the domain and rename it.
Reboot the Secondary Server and connect its vNICs.
Join the Secondary Server back to the domain and add the proper AD groups to the Administrators group.
Note: You may need to re-join the domain twice to make sure AD synchronization is correct. I hit a vCenter Server startup issue in my initial deployment due to an AD synchronization problem.
Create a shared folder on a reliable server that both the Primary and Secondary Server can access.
Make sure the configured IP addresses are pingable from each server.
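Before starting the installer, it can help to write the address plan down and check it against the HA/DR rule described above (same Principal address in HA mode, different in DR mode). The following is a minimal sketch. The `NodePlan` type and `validate_plan` function are hypothetical names, and the extra check that Management and Channel addresses differ between the two servers is my assumption, not something the installer documents.

```python
import ipaddress
from typing import NamedTuple


class NodePlan(NamedTuple):
    """Planned addresses for one vCHB node (hypothetical helper type)."""
    name: str
    principal_ip: str
    management_ip: str
    channel_ip: str


ROLES = ("principal_ip", "management_ip", "channel_ip")


def _parse(value):
    """Return an ip_address object, or None if the string is not a valid IP."""
    try:
        return ipaddress.ip_address(value)
    except ValueError:
        return None


def validate_plan(primary: NodePlan, secondary: NodePlan, mode: str) -> list:
    """Return a list of problems found in the plan; an empty list means it looks sane."""
    if mode not in ("HA", "DR"):
        raise ValueError("mode must be 'HA' or 'DR'")
    problems = []
    parsed = []
    for node in (primary, secondary):
        addrs = {}
        for role in ROLES:
            addrs[role] = _parse(getattr(node, role))
            if addrs[role] is None:
                problems.append(f"{node.name}: {role} is not a valid IP address")
        parsed.append(addrs)
    first, second = parsed
    p1, p2 = first["principal_ip"], second["principal_ip"]
    if p1 is not None and p2 is not None:
        if mode == "HA" and p1 != p2:
            problems.append("HA mode: Principal IP must be the same on both servers")
        if mode == "DR" and p1 == p2:
            problems.append("DR mode: Principal IP must differ between servers")
    # Assumption: each server needs its own Management and Channel address.
    for role in ("management_ip", "channel_ip"):
        if first[role] is not None and first[role] == second[role]:
            problems.append(f"{role} is identical on both servers")
    return problems
```

For example, two nodes that share a Principal address pass in `"HA"` mode and produce one problem in `"DR"` mode. The addresses you feed in are whatever you planned; the function makes no network calls, so you still need the ping test above.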
Bring up vCenter Server services on Primary Server.
Select Install VMware vCenter Server Heartbeat to start installation.
Select Primary to install vCHB on Primary Server.
Apply license key.
Select LAN or WAN according to your architecture.
Select the "Secondary Server is Virtual" option. (I only tested that option.)
Confirm installation path.
Select vNIC for VMware Channel network.
Enter VMware Channel IP addresses of Primary and Secondary Server.
For HA mode, you can use non-routable or routable IP addresses.
For DR mode, you must use routable IP addresses so that the VMware Channel networks can reach each other over the WAN.
Select vNIC for Public Network.
Enter the Principal Network IP addresses for both servers.
For HA mode, the IP address should be the same on both servers.
For DR mode, the IP addresses should be different; you have to enter them manually.
Select the options accordingly.
If you selected Different IP addresses in the step above, you will need to enter a Windows account that can update DNS. (Refer to KB1008605 if you use BIND9 DNS instead of the Windows DNS service.)
Then configure the Management Network. This network is used for RDP.
Rename the computer name of both servers. It looks as if this only renames the Primary Server, with no change to the Secondary Server, but you don’t have to worry about that since we already renamed the Secondary Server in an earlier step.
Set the client port; I used the default.
Select the components you want to protect and enter the vCenter login. This login must have Administrator rights on vCenter Server.
Also enter the SSO master password. Note that the SSO master password may differ from the SSO administrator password, so make sure you enter the correct one.
Enter the share path you created earlier. This folder will store the cluster configuration information for the Secondary Server installation.
vCHB starts checking the system.
You will lose RDP connectivity for about 10 seconds during installation due to the Packet Filter installation.
Once the installation completes, you can start on the Secondary Server; just make sure you select Secondary.
All other steps are similar to those on the Primary Server.
Start the vCHB services on the Secondary Server.
Open the vCenter Server Heartbeat Management Console.
Add each node by its Management Network address.
Wait a while, and you will see a screen similar to the following screenshot.