How to get HBA WWPN of ESXi hosts

It’s been a busy month. I haven’t updated my blog since I came back from Phuket with my wife. I’m juggling multiple projects and am a little overloaded.

Just a quick share: my storage team asked me to provide the WWPNs of all hosts for a health check. It’s a nightmare to pull that data out of the vSphere Client or Web Client, but I found a way to get it with PowerCLI.

Get-VMHost -Location "cluster name" | Get-VMHostHba -Type FibreChannel | Select-Object VMHost,Device,@{N="WWPN";E={"{0:X}" -f $_.PortWorldWideName}}

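The key part is "{0:X}" -f $_.PortWorldWideName:

{0:X} is the format string. It renders the first argument as uppercase hexadecimal.

-f is the PowerShell format operator, not a pipeline. It applies the format string on its left to the values on its right.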

$_.PortWorldWideName is the value you want to convert.
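
If you export the raw numeric values and want to format them outside PowerShell, the same conversion can be sketched in Python. This is only an illustration: the function name format_wwpn and the sample value are made up, and the colon-separated layout is an extra step that the one-liner above does not perform.

```python
def format_wwpn(value: int, sep: str = ":") -> str:
    """Render a 64-bit WWPN integer as 16 uppercase hex digits, grouped in pairs."""
    if not 0 <= value < 2**64:
        raise ValueError("a WWPN must fit in 64 bits")
    digits = f"{value:016X}"  # same idea as "{0:X}" -f value, but zero-padded
    pairs = [digits[i:i + 2] for i in range(0, 16, 2)]
    return sep.join(pairs)


if __name__ == "__main__":
    # Hypothetical sample value, not taken from a real host.
    print(format_wwpn(0x5001438001234567))
    print(format_wwpn(0x5001438001234567, sep=""))
```

Passing sep="" gives the plain hex string, which is what the PowerShell expression produces.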

Receive Side Scaling on UCS Blades

When you implement enterprise applications such as SAP, Oracle or SQL in a UCS virtualization environment, the default settings of UCS blades may not suit the application. We always want the highest performance from tuning the hardware and ESXi. In a UCS training session, I noticed one “hidden” parameter that may help performance.

Receive Side Scaling (RSS) is a feature that lets you use multiple CPUs, and multiple cores per CPU, to process received network traffic. Without RSS, all received traffic is processed by a single core of a single CPU. Essentially, RSS distributes the receive load across all of the CPUs and their cores.
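
To make the idea concrete, here is a toy sketch of how flows can be spread over receive queues by hashing. It is not how the adapter implements RSS: real NICs typically use a Toeplitz hash in hardware, and I swapped in CRC32 just to keep the example short. The function names and the sample flows are hypothetical.

```python
import zlib
from collections import Counter


def pick_queue(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Map one flow to a receive queue index (toy stand-in for the NIC's RSS hash)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


def spread(flows, num_queues):
    """Count how many flows land on each queue."""
    return Counter(pick_queue(*flow, num_queues) for flow in flows)


if __name__ == "__main__":
    # Hypothetical flows: 1000 clients talking to one server port.
    flows = [(f"10.0.{i // 250}.{i % 250 + 1}", 40000 + i, "10.1.1.10", 1433)
             for i in range(1000)]
    print(spread(flows, num_queues=1))  # one queue: every flow lands on queue 0
    print(spread(flows, num_queues=8))  # eight queues: flows are shared out
```

Packets of the same flow always hash to the same queue, so the work is spread across cores without reordering packets inside a connection.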

You might expect the parameter to be a BIOS option, but it’s not under the BIOS policy in UCS Manager. Go to the Servers tab, expand the Policies node, and open an Eth Adapter Policy under the Adapter Policies node. Receive Side Scaling (RSS) is in the Options section of the right pane. The blade must be rebooted for the option to take effect.

Please keep in mind: do not enable RSS if you have more adapters than CPUs, because it can cause unexpected network transmit failures. The RSS option must be enabled in the UCS policy before you enable it at the OS layer (I confirmed this with Cisco TAC; is that your experience too?). For the OS layer, please refer to these articles:

Receive side scaling on Intel® Network Adapters

How to enable Receive Side Scaling on Microsoft Windows Server 2008 R2

You don’t have to enable the option if network traffic is not a concern.

How to Install Proper Drivers for 3rd Party Network Adapter on ESXi 5.x

Most companies use HP, Cisco or IBM network adapters, and most of those drivers are included in the ESXi 5.x images. But what if your network adapter comes from another vendor? I’m going to show you how to identify and install the proper drivers for a 3rd party network adapter on ESXi 5.x.

My ESX 4.0 servers (HP DL380 G7) were working properly in a VDI environment. I upgraded the hosts to ESXi 5.x to take advantage of the new features, but unfortunately the 3rd party network adapters didn’t work after I installed ESXi 5.1. vmnic0, 1, 2 and 3 belong to the embedded HP NIC; vmnic4, 5, 6 and 7 disappeared from the vSphere Client. I’m going to use vmnic4 as the example.

Identify network adapter model

The vSphere Client shows the additional NIC model as ServerEngine Corp. OneConnect 10Gb NIC. That doesn’t give me much information.

The two NICs show as Unknown PCI device in the BIOS, which indicates they are not HP NICs.

Search for the keyword vmnic4 in the vmkernel log with the command grep vmnic4 /var/log/vmkernel.log, and you will see that ESXi cannot load the driver.

Run the command vmkchdev -l | grep vmnic4 to get output similar to this:

002:01.0 19a2:0700 10df:e622 vmkernel vmnic4

In this example, the values are:

VID = 19a2

DID = 0700

SVID = 10df

SDID = e622
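
If you have many hosts to check, pulling the four IDs out by eye gets tedious. Here is a small sketch that extracts them from one line of vmkchdev -l output. It assumes the line layout shown above (PCI address, VID:DID, SVID:SDID, owner, device name); the function name parse_vmkchdev_line is made up for this example.

```python
import re

# VID:DID and SVID:SDID are four hex digits each; optional spaces after the
# colon are tolerated in case the output was copied with extra whitespace.
_LINE = re.compile(
    r"(?P<vid>[0-9a-fA-F]{4}):\s*(?P<did>[0-9a-fA-F]{4})\s+"
    r"(?P<svid>[0-9a-fA-F]{4}):\s*(?P<sdid>[0-9a-fA-F]{4})\s+"
    r"(?P<owner>\S+)\s+(?P<name>\S+)"
)


def parse_vmkchdev_line(line):
    """Return the PCI IDs, owner and device name from one vmkchdev -l line."""
    match = _LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognized vmkchdev line: {line!r}")
    fields = match.groupdict()
    for key in ("vid", "did", "svid", "sdid"):
        fields[key] = fields[key].lower()  # normalize hex IDs for lookups
    return fields


if __name__ == "__main__":
    sample = "002:01.0 19a2:0700 10df:e622 vmkernel vmnic4"
    print(parse_vmkchdev_line(sample))
```

The resulting VID, DID, SVID and SDID values are what you type into the compatibility guide search.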

Match the IDs in the VMware Hardware Compatibility Guide under IO Devices.

The result shows that the network adapter is an Emulex OneConnect OCe10102-N.

Upgrade firmware and drivers

It looks like a native Emulex network adapter, so let’s search the Emulex website.

The driver download page for VMware vSphere 5.1 states: “You must update the firmware and boot code on the OCe11102 and the OCe10102 UCNAs when you install the driver.”

Go to the firmware downloads page and download the offline OneConnect Flash ISO version 4.6.142.0, mount it on the HP server via iLO, and reboot.

The flash utility finds the network adapters and asks whether to continue. Press Y to start the firmware upgrade.

Run the command reboot after the upgrade completes.

The network adapter driver is actually already included in ESXi 5.1, so you don’t have to upgrade it again. You can run the command esxcli software vib list | grep be2net to find the installed driver version.

Generic Trust Failure when install SCVMM 2012 SP1

Today I hit an unusual problem that I want to share with you. I tried to install the SCVMM 2012 SP1 console on my Windows 7 VM to do some troubleshooting, but I got the error message “Generic Trust Failure” when I clicked the Install button in the SCVMM 2012 SP1 installer. It mentioned something related to the Microsoft Visual C++ 2010 x86 Redistributable.

I tried running the Microsoft Visual C++ 2010 x86 Redistributable installer directly from the image folder, and it showed exactly the same error message. I found no exact match on Google, but most posts pointed to the digital signature.

After a deep dive into the problem, I figured out a solution:

Go to the Prerequisites\VCRedist\i386 folder of the SCVMM 2012 SP1 image.
Copy vcredist_x86.exe to a local disk.
Extract the executable file to a folder. (You need WinZip or a similar tool to do that.)
Open the extracted folder and right-click Setup.exe.
Select Properties.
Go to the Digital Signatures tab.
Highlight the certificate and click Details.
Click the View Certificate button in the pop-up window.
Click the Install Certificate button.
Complete the wizard with the default options.

Troubleshooting Microsoft products is quite different from troubleshooting Linux. You have to divide and conquer, dive deep into each element of the product, and read each log carefully; then you will find the root cause.

Please let me know if you have a better solution. 🙂

HP Blade Firmware Upgrading Best Practices for ESXi Host

I discussed this topic with a group. Some people think a firmware upgrade is not required if the ESXi host is working fine. That may suit a small business, but I think an enterprise can do better.

My ESXi hosts run on HP blades, so I’ll use that platform as the example to share my thoughts and experience.

Why do you need a plan for HP blade firmware upgrades on ESXi hosts?

The first voice in my head is “We suggest you upgrade the firmware to the latest version”. You may have had an experience like mine when calling HP for help; it sounds like HP’s official statement whenever we suspect a hardware problem. 😉 You know how hard it is to upgrade a large batch of ESXi hosts just to troubleshoot a network or storage problem. If your hosts are running old versions, it can be extremely time consuming. So keeping firmware up to date saves troubleshooting time and makes your life easier. 🙂

Even with no hardware issue, you may still need to upgrade software. It’s rare, but some software may conflict with old firmware. In that scenario, plan for significant downtime, because the firmware upgrade takes longer when your servers are on old versions.

A reboot is required for most firmware upgrades.

HP blade firmware upgrading tools for ESXi host

HP’s statement is reasonable: their firmware has a lifecycle, and the official HP policy only supports updating to a new version that is within two versions of the currently installed one.

Recently HP has been replacing its old firmware tools with the HP Service Pack for ProLiant (SPP). SPP is an all-in-one image file that includes firmware, drivers and management tools for ProLiant servers. Thanks, HP. It was pretty confusing when I upgraded the old way; now it’s easy to know exactly which firmware level your servers are on.

You can upgrade an ESXi host in the two ways below. Online upgrading is recommended. Refer to the HP document “HP ProLiant Gen8 and later Servers – Understanding the Differences between Online and Offline Modes in HP SUM”.

Online upgrading – ESXi 5.x is the first release that supports online firmware upgrades, which is a real benefit for production ESXi hosts. On the other hand, SPP doesn’t support online upgrades for every component on an ESXi host (power management, for example), and you have to install the HP customized ESXi image to use online upgrading.

Offline upgrading – offline upgrading is the conventional method for every OS. About 30 minutes of downtime is required for each blade.

You can find more details about SPP on the HP website.

Best practices for HP blade firmware upgrading

This is what I’m using now; it may give you some ideas on how to plan firmware upgrades for ESXi hosts.

Before implementing firmware upgrades

Ensure the HBA firmware is supported by the storage vendor.
Ensure the NIC firmware is supported by the OS and the switch.
Check the VMware Compatibility Guide.
Create SPP servers.
You may have multiple datacenters in different locations. Prepare a server at each location to store the SPP image; loading the image from a local server reduces load time.
Create firmware baselines.
To keep ESXi host firmware up to date, I suggest creating a baseline and upgrading every ESXi host to exactly the firmware versions in that baseline. An enterprise datacenter may have thousands of ESXi hosts, and unified firmware makes the environment more stable. Troubleshooting also becomes more efficient, since you can identify hardware issues quickly.
Create rollback plans.
HP firmware can be forcibly rolled back, but that is not 100% successful. Prepare alternatives for a failed upgrade, such as vendor support, data recovery from tape, etc.
Create an update plan.
Which SPP will you use?
Which ESXi version should go with the baseline?
How will you upgrade the ESXi hosts?
Create a testing environment.
I recommend testing if you want everything to upgrade smoothly. At least run the upgrade on one ESXi host, keep it running for 72 hours, and monitor the vmkernel log for any issues.
Generate a firmware report.
A firmware report is required to understand the whole picture.
You can generate the reports with the HP SUM (Smart Update Manager) bundled in the SPP image, or you can download SUM from the HP website and run it on a server. The bundled version has problems generating reports for some blade models, so the latest version is preferred.
Identify hotfixes and critical advisories.
Reading the SPP release notes and HP customer advisories (CA) to understand known issues and workarounds will make your IT life beautiful. 🙂

Pre-checks before upgrading OA/VC

HP blades are installed in an enclosure. They are managed by the enclosure’s Onboard Administrator (OA) and connect to the network and storage through Virtual Connect modules (VCM). Blade firmware must be compatible with the OA and VCM firmware versions as well.

Before upgrading, spend some time verifying the enclosure health and versions with the following steps.

Perform a health check on the VC modules with the Virtual Connect Support Utility.
If the OA firmware is 1.x, it must be updated to 2.32 before updating to newer versions.
If the VC firmware is later than 3.00, the OA must be at 3.00 first.
Run the HP Virtual Connect Pre 3.30 Analyzer if the VC version is 3.x and you are upgrading to 3.30.
Make sure the VC modules are set up in a redundant configuration. Stacking links should be configured.

You also need to make sure the blade drivers are updated from the same SPP image before upgrading.

Firmware upgrading

As I mentioned above, blade firmware must be compatible with the OA/VCM firmware. The upgrade sequence is very important: a blade may lose communication with the OA/VCM if you upgrade in the wrong sequence.

If VC is earlier than 1.34:
The sequence is VC -> OA -> Blade.
If VC is 1.34 or later:
Online mode sequence is OA -> Blade -> VC. (This is for firmware upgrades from the SPP image.)
Offline mode sequence is OA -> VC -> Blade. (This is for upgrades from the CLI or in offline mode.)
Insert the SPP image via iLO. (You can also extract the image to the local disk of the target server if it runs Windows.)
Boot from CD-ROM if you run it via iLO.
I recommend selecting Interactive Mode the first time you do this for a particular hardware specification.
Follow the wizard to the review stage.
Make sure all hardware is listed in the update list.
Reboot after the upgrade completes.

Note: If your blade firmware/drivers are at SPP 2013.02 or earlier, you must upgrade VC to 4.01 or later first, and then upgrade the blades.
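
The sequence rules above are easy to mix up during a long maintenance window, so here is a small sketch that encodes them. It only covers the VC 1.34 cut-off and the online/offline split described in this post, not the SPP 2013.02 note. The function names are made up, and it assumes versions are written as major.minor with a two-digit minor (for example 1.34 or 3.30).

```python
def parse_version(text):
    """Turn '1.34' into (1, 34) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def upgrade_sequence(vc_version, online):
    """Return the component order for a firmware upgrade.

    vc_version: current Virtual Connect firmware, e.g. "3.30"
    online:     True for an online SPP upgrade, False for CLI/offline mode
    """
    if parse_version(vc_version) < (1, 34):
        return ["VC", "OA", "Blade"]
    if online:
        return ["OA", "Blade", "VC"]
    return ["OA", "VC", "Blade"]


if __name__ == "__main__":
    print(" -> ".join(upgrade_sequence("1.30", online=False)))
    print(" -> ".join(upgrade_sequence("3.30", online=True)))
    print(" -> ".join(upgrade_sequence("3.30", online=False)))
```

Comparing tuples instead of strings avoids mistakes such as treating "1.4" as newer than "1.34" by accident of text ordering; still, double-check how your versions are written before trusting the result.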

Those are the best practices I’m using. Please let me know if you have better ideas.

How to configure nested Hyper-V VM on VMware Workstation

First, I would like to recommend the Dell M4800 for a small home lab. I spent a lot of time looking for lab hardware that is quiet, light, and preferably portable. I checked out the HP mini server, Apple Mac mini, MacBook Pro, etc. They are nice products that show off the best design in the IT industry, but none of them is perfect. Finally I chose the Dell M4800. The reason is that RAM can be upgraded to 32GB and three SSDs are supported: one in the native drive bay, one in the CD-ROM bay, and one mSATA SSD in the WWAN slot. An SSD is a must-have for an IT lab; it can deliver more than 10K IOPS without significant performance degradation.

Okay, back to the topic. If you want to test Windows Server 2012 R2 Hyper-V in a lab, you will probably want to install it in a VMware Workstation VM. You have to follow the proper steps to make sure Hyper-V functions.

After you create the Hyper-V VM:

Keep the VM powered off.
Go to the VM’s Settings.
Highlight Processors.
Select the Virtualize Intel VT-x/EPT or AMD-V/RVI option.
Go to the Options tab.
Highlight General.
Select Microsoft Windows.
Select Hyper-V in the drop-down list.
Power on the VM.

Error 12711 VMM cannot complete the WMI operation on the server because of an error

I finally implemented Hyper-V 2012 and SCVMM 2012 R2 in my lab. Unfortunately FreeNAS does not support the SCSI-3 persistent reservations that Windows Server 2012 R2 uses; see bug #4003. As a result, my iSCSI storage could not be brought online in the Failover Cluster, so I had to find an alternative.

I eventually decided to use a Windows Server file server instead of iSCSI. There are a bunch of benefits to taking advantage of the new SMB 3.0 technology; the key one is that it supports high availability.

Following the guide, I successfully created the first shares for the Hyper-V cluster. I created a test VM but could not power it on. It showed me:

Error (12711)
VMM cannot complete the WMI operation on the server (dcahyv02.contoso.com) because of an error: [MSCluster_Resource.Name="SCVMM test (1)"] The cluster resource could not be brought online by the resource monitor.

The cluster resource could not be brought online by the resource monitor (0x139A)

Recommended Action
Resolve the issue and then try the operation again.

I opened the cluster manager on a Hyper-V host, and the event log showed:

Cluster resource ‘SCVMM test (1)’ of type ‘Virtual Machine’ in clustered role ‘SCVMM test Resources (1)’ failed. The error code was ‘0x80004005’ (‘Unspecified error’).

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Initially I suspected a problem with the new file server, or an SCVMM bug. But the VM could not be brought up even when I created it directly on the Hyper-V host. It gave me this error:

Virtual machine ‘test’ could not be started because the hypervisor is not running (Virtual machine ID AE786CAA-C74B-4F9E-8867-30191197087B). The following actions may help you resolve the problem: 1) Verify that the processor of the physical computer has a supported version of hardware-assisted virtualization. 2) Verify that hardware-assisted virtualization and hardware-assisted data execution protection are enabled in the BIOS of the physical computer. (If you edit the BIOS to enable either setting, you must turn off the power to the physical computer and then turn it back on. Resetting the physical computer is not sufficient.) 3) If you have made changes to the Boot Configuration Data store, review these changes to ensure that the hypervisor is configured to launch automatically.

Hypervisor is not running… that indicates something related to the virtualization layer. I finally realized that I was running the Hyper-V host in VMware Workstation on a laptop, so the VM is nested two levels deep. Something was wrong there!

I followed my article How to configure nested Hyper-V VM on VMware Workstation to fix this problem.

Windows cannot be installed on drive 0 partition 1

I think Windows Server 2012 will be the next popular server OS, just like Windows Server 2008. It’s also a nice hypervisor OS in the virtual world. What do you think?

Installation is the first step to experiencing this wonderful OS, and you may see some strange problems during that step, just like I did. Today’s issue happened a long time ago; I just want to share it with people who may face something similar.

This was an HP blade system with local disks attached; you may see a similar problem with other vendors. When you select the disk to install the OS on, the installer may say “Windows can’t be installed on drive 0 partition 1”, or “Windows cannot be installed on this disk. This computer’s hardware may not support booting to this disk. Ensure that the disk’s controller is enabled in the computer’s BIOS menu.”

That’s because no boot volume is set on the array controller. On HP servers, for example, reboot and press F8 after the BIOS checks the array controller to enter the array controller management interface. Then go to Select Boot Volume in the main menu, select Direct Attached Storage, and select the disk you want to install the OS on. Follow the wizard to continue booting.

If the problem persists, go back to the array controller management interface, rebuild the array and select the boot volume again. That should fix your problem.

Google AdSense available on my blog

About a month ago I applied for Google AdSense for my blog, and I almost forgot the request because life has been busy. My friend Saju told me his IT blog has Google AdSense, which reminded me that I had a pending application. The ad slot was blank after I set it up on my blog, but this morning it finally showed ads… It’s not about the money, it’s just part of an IT blog. lol

I still remember that my first Google AdSense check came 10 years ago, and that it was $200. My friend and I were so excited when we learned the check had arrived in China. That was the first time I earned USD, and probably the first time I saw what USD looks like. 🙂 Google AdSense… it brought back memories. That was a tough time in my life, but I still want to thank my family, my friends and everyone who supported me.

Times have changed. What happened back then is no longer hatred and pain in my heart; it is a small part of my life’s experience, and the kind of setback a man should go through. I hope the future will be better.

Nodes in the ESXi cluster may report corruption after reboot host or attach device

VCE just released a new KB, vce2563, to describe the issue.

If your ESXi 5.x hosts are connected to a VMAX running Enginuity 5876.159.102 or later, you may see this particular issue after rebooting an ESXi host or attaching storage, if you have enabled the block delete feature of VAAI.

To check the option status, you can run the following command in PowerCLI:

Get-VMHost -Location "cluster name" | Get-VMHostAdvancedConfiguration -Name VMFS3.EnableBlockDelete