Category: English

English version of my posts.

  • Get specific advanced configuration of ESXi host

    The storage team said the best-practice value for QFullSampleSize is 32, and they wanted to check how it is set in our environment. It’s easy to check an individual host, but pretty time-consuming if you want to check 300+ hosts.
    Here is a one-line PowerShell script to export QFullSampleSize and QFullThreshold to a CSV file.

    Get-VMHost | %{ $HostName=$_.Name; $HostCluster=$_.Parent; Get-VMHostAdvancedConfiguration -VMHost $_ | % { $_.getEnumerator()| ? {$_.Key -like "*QFull*"} | select Name,Value,@{N='host';E={$HostName}},@{N='Cluster';E={$HostCluster}} } } | export-csv c:\qSetting.csv

  • The number of heartbeat datastores for host is 0, which is less than required: 2

    Today I saw this error message on an ESXi 5.0 host:

    The number of heartbeat datastores for host is 0, which is less than required: 2

    No VM was running on the host, whether placed by DRS or HA. The VMware KB gives a solution, but it is too complicated.

    Reconfiguring HA fixes the problem.

    Right-click the host -> click Reconfigure for vSphere HA -> wait for the HA configuration to complete.

  • No permission to login to vCenter Server 5.1

    Today we P2V’d a vCenter Server. For some reason I had to re-add the identity source, but I didn’t modify any existing domain groups or ACLs.
    After a while I got an interesting case: users reported “No permission to login to vCenter Server 5.1” in the vSphere Client.
    I looked into the vpxd.log of the vCenter Server, which showed:

    2013-05-01T11:08:01.399-05:00 [09108 error '[SSO]' opID=6e704a51] [UserDirectorySso] AcquireToken InvalidCredentialsException: Authentication failed: Authentication failed
    
    2013-05-01T11:08:01.399-05:00 [08644 error 'authvpxdUser' opID=5469f71e] Failed to authenticate user <xxxx>

    I was not 100% sure that the log was related to the real problem, but it indicated that the issue was somewhere in the authentication components.
    After comparing the working SSO with the faulty SSO, I noticed that Domain Alias was blank on the faulty SSO:

    Identity source

    Then I added a domain group on the faulty vCenter Server and compared it with the same group on the working vCenter Server. The format was different, like this:
    Working SSO – CONTOSO\TEST-GROUP
    Faulty SSO – CONTOSO.COM\TEST-GROUP

    Okay… now I know why user logins failed. The identity source had a Domain Alias configured before I removed it on the faulty SSO. I then re-added the identity source without a Domain Alias, so vCenter Server used the domain name as the default prefix for domain groups, and the original domain group format ( CONTOSO\xxxx ) could no longer be recognized by SSO.

    So I deleted the identity source and added the same source with a Domain Alias. Problem fixed…

  • How to retrieve or set Path Selection Policy by vCLI

    First of all, this article has nothing to do with PowerCLI. 🙂

    You probably know how to set the Path Selection Policy (PSP) in the vSphere Client, but what if you have to set up 100 LUNs by hand? A few scripts can make your life easier.

    How to retrieve LUN Path Selection Policy:

    esxcli storage nmp device list | egrep "Device Display Name|Path Selection Policy:"


    You will get output like this:

    Device Display Name: DGC Fibre Channel Disk (naa.600601602a102e0002cdf2a2596be211)
    Path Selection Policy: VMW_PSP_RR


    This command helps you identify which policy each LUN uses. What a Path Selection Policy is, is explained here.
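    The egrep output puts the device name and its policy on separate lines, which gets hard to read with many LUNs. A minimal sketch of a one-line-per-LUN summary follows. It assumes each device block in the listing starts with an unindented naa. identifier; the function name psp_summary and the abbreviated sample listing are only for illustration, not part of vCLI.

```shell
# Print "device-id policy" pairs from `esxcli storage nmp device list` output.
# Reads the listing on stdin, so it can also be tried against saved output.
psp_summary() {
  awk '
    /^naa\./                 { dev = $1 }        # unindented line: device identifier
    /Path Selection Policy:/ { print dev, $NF }  # last field is the policy name
  '
}

# Abbreviated, illustrative sample of the listing (assumption, not real host output).
sample='naa.600601602a102e0002cdf2a2596be211
   Device Display Name: DGC Fibre Channel Disk (naa.600601602a102e0002cdf2a2596be211)
   Path Selection Policy: VMW_PSP_RR
   Path Selection Policy Device Config: {policy=rr}'

printf '%s\n' "$sample" | psp_summary
```

    On a host the same function would be fed directly: esxcli storage nmp device list | psp_summary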

    Next, let’s see how to modify the PSP of these LUNs by script.
    First, run the following command to print out a set command for each LUN. Don’t forget to change VMW_PSP_RR to the PSP you prefer.

    esxcli storage nmp device list | awk '/^naa/{print "esxcli storage nmp device set -d "$0" -P VMW_PSP_RR" };'


    Then copy the output to Notepad and remove the local disks. For example, naa.600508b1001c1e987243838af4c67891 in the list below is a local HP disk.

    esxcli storage nmp device set -d naa.600601602a102e008896dda81b88e211 -P VMW_PSP_RR
    esxcli storage nmp device set -d naa.600601602a102e008861b28a596be211 -P VMW_PSP_RR
    esxcli storage nmp device set -d naa.600601602a102e00560d8488b456e211 -P VMW_PSP_RR
    esxcli storage nmp device set -d naa.600601602a102e00c4cd2600b456e211 -P VMW_PSP_RR
    esxcli storage nmp device set -d naa.600508b1001c1e987243838af4c67891 -P VMW_PSP_RR
    esxcli storage nmp device set -d naa.600601602a102e008c96dda81b88e211 -P VMW_PSP_RR


    Last, paste the modified text back into the PuTTY session; it will run the commands one by one.
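    The Notepad step can also be skipped by filtering on the NAA prefix. This is only a sketch: it assumes every array LUN shares the prefix naa.60060160 seen in the example output above, and the function name psp_set_cmds and the sample lines are made up for illustration. It only prints the commands, so you can still review them before pasting.

```shell
# Print "set" commands only for devices whose ID starts with the given prefix.
# Usage: <device list> | psp_set_cmds <policy> <naa-prefix>
psp_set_cmds() {
  awk -v psp="$1" -v prefix="$2" '
    index($0, prefix) == 1 { print "esxcli storage nmp device set -d " $1 " -P " psp }
  '
}

# Illustrative input: one array LUN and one local HP disk.
printf '%s\n' \
  'naa.600601602a102e008896dda81b88e211' \
  '   Device Display Name: DGC Fibre Channel Disk (naa.600601602a102e008896dda81b88e211)' \
  'naa.600508b1001c1e987243838af4c67891' \
  | psp_set_cmds VMW_PSP_RR naa.60060160
```

    On a host: esxcli storage nmp device list | psp_set_cmds VMW_PSP_RR naa.60060160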

  • How to retrieve RDM information by PowerCLI

    I worked on moving the RDM LUNs of Microsoft Cluster virtual machines from one iGroup to another. To make sure the move was safe, we needed to record the RDM LUN information before the migration.

    We had two VMs with almost 20 RDM LUNs, so collecting the information manually would have been pretty time-consuming. I used the following script to retrieve it:

    $RDMinfo = Get-HardDisk -VM "virtual machine name" -DiskType rawPhysical
    
    $RDMinfo | select Parent,Filename,CapacityGB,ScsiCanonicalName,Name

     

  • Port Groups not Work with VLAN Tag on Cisco Switch

    A few weeks ago I tried to standardize the networking of a cluster. There were 4 VLANs for production virtual machines, and I bound the VLANs to one virtual switch that had 4 physical vmnics.

    Then I created 4 port groups with different VLAN IDs, but for some reason virtual machines were unreachable via some vmnics. The network team verified that the port channel was good.

    I tried several ESXi 5.0 hosts in the cluster and all had the same problem. Finally we found that it’s a Cisco switch bug… you can find detailed information and a workaround here.

  • HP patching error after upgrade to Update Manager 5.1

    If you installed “HP ESXi 5.0 Complete Bundle Update 1.6” via Update Manager 5.0, you may see the storage and power subsystems showing warnings on HP servers. That’s because some parameters show NULL in the updated HP SIM provider.

    Example:
    
    HPVC_SAController.Name="vmwControllerHPSA1",CreationClassName="HPVC_SAController"
     CreationClassName = HPVC_SAController
     Name = vmwControllerHPSA1
     PowerManagementCapabilities = (NULL)
     ResetCapability = (NULL)
     OtherDedicatedDescriptions = (NULL)
     Dedicated = (NULL)
     NameFormat = (NULL)
     TransitioningToState = 12
     AvailableRequestedStates = (NULL)
     TimeOfLastStateChange = (NULL)
     EnabledDefault = 2
     RequestedState = 12

    I think HP has recalled the bundle. If you had already downloaded the patch and then upgraded to Update Manager 5.1, you may see error messages similar to the ones below.

    VMware vSphere Update Manager had an unknown error. Check the events and log files for details.
    (After upgrading to Update Manager 5.1)

    Cannot download software packages from patch source. Check the events and the Update Manager log for download details.
    (After removing the "data" folder in Update Manager 5.1)

    There is no way to avoid the error message except filtering your baseline to exclude HP patches.

    Another blogger also described the same situation here.

  • Unknown status of Hardware Acceleration

    While reading the VMware documentation, I found a cool feature, Hardware Acceleration, in the storage guide. It reminded me of an outage about a year ago: our NetApp filer crashed due to a motherboard problem, some of the datastores failed, and we had to move virtual machines from that filer to another. We noticed the Storage vMotion performance was pretty high; the data moved about twice as fast as with a regular Storage vMotion. That’s the advantage of Hardware Acceleration.

    The first task of this year is to standardize the virtualization environment. I found an interesting problem when I checked the Hardware Acceleration part: the same LUNs showed different statuses on different ESXi 5 hosts in a cluster. Some hosts showed Hardware Acceleration as enabled, and some showed Unknown.

    The storage is an EMC CLARiiON CX series array with ALUA enabled. I found that the working hosts had the VAAI filter attached, while the non-working hosts had nothing.

    Figure 1   Working Host

    Figure 2   Non-working Host

    ESXi 5 automatically attaches different filters according to the LUN properties. The issue indicated that the LUN properties differed between ESXi 5 hosts, which is a storage-layer issue. After troubleshooting with EMC, we found that the Failover Mode of the LUNs was different on each host; the Failover Mode should be 4 instead of the default 1.

    Please be aware that storage activity on the particular host will be interrupted when you change the Failover Mode, so put the host in maintenance mode first.

    Regarding Failover Mode, I had a discussion with a storage engineer. He told me that different storage vendors have different names for “Failover Mode”, and some vendors may ask you to choose the OS type of the target machine instead. For EMC there are 5 modes; please refer to page 10 of the EMC document.

  • How to remove multiple snapshot by PowerCLI

    My SMVI backup job crashed a few days ago, and the stupid application generated a lot of snapshots for the virtual machines!!! Hundreds of them!

    I really didn’t want to remove them one by one! This is what I used to clean up the snapshots.

    Get-VM | Get-Snapshot -Name smvi* | Remove-Snapshot

    I used the wildcard smvi*, which matches every snapshot whose name starts with smvi.

  • Unable to find new lun when you try to extend vmfs datastore

    You may run into this rare problem: your storage team allocates a new LUN to an ESXi 5.0 host, and the LUN is visible on the add new storage screen but invisible on the extend datastore screen.

    Add new storage screen:

    Add storage screen

    Increase datastore capacity:

    Increase datastore capacity screen

     

    That’s because the datastore or LUN is connected to multiple ESXi / ESX hosts running different versions. Make sure the storage is connected to hosts running the same version of ESXi / ESX.