Vlan ‘xxx’ resolved to unsupported VLAN ID in Cisco UCS Manager

You may need only one IP address for the blade console in Cisco UCS Manager; see Understanding “Management IP” of Cisco UCS Manager for how to configure it. If you delete the existing inbound and outbound IP pools while cleaning up management IP pools, you may see the warning “Vlan ‘xxx’ resolved to unsupported VLAN ID”.

That’s because the inbound IP address of the blade has not been cleaned up. Go to “Equipment” -> “Chassis” -> target chassis -> “Servers” -> target server -> “Inventory” tab -> “CIMC” tab, click “Change Inbound Management IP”, and remove the existing VLAN and IP pool.

The inband IP tab is blank once the change is saved. Note that the IP address is reassigned after about 1 minute if you click “Delete Inband Configuration” instead of “Change Inbound Management IP”.

Understanding “Management IP” of Cisco UCS Manager

KVM IP addressing in Cisco UCS Manager differs from HPE servers. UCS Manager may assign multiple IP addresses to the same blade if it is not configured properly. In my case each blade got 3 IP addresses!

There are actually 3 types of IP address for KVM (the Cisco manual says 2):

  • Outbound Management IPs.
  • Inbound Management IPs for Blades.
  • Inbound Management IPs for Service Profiles.

“Outbound Management IP” is the default for KVM. Every newly deployed blade tries to get a DHCP IP over the management port, in the same VLAN as Cisco UCS Manager.

The 2nd and 3rd types are the more confusing ones. “Inbound Management IPs for Blades” is the “hardware” perspective; “Inbound Management IPs for Service Profiles” is the “logical” perspective.

If you go to “Equipment” -> Chassis -> blade and click KVM to open the console, you get the console over either the “Outbound Management IP” or the “Inbound Management IPs for Blades”.

If you go to “Servers” -> “Service Profiles” and click the KVM of a service profile, you get the console over either the “Outbound Management IP” or the “Inbound Management IPs for Service Profiles”.

If you want just 1 IP per blade, regardless of whether it is for the hardware or the service profile, do the following:

  1. Delete the range of the default “ext-mgmt” pool under “IP Pools” in the “LAN” node of Cisco UCS Manager.
  2. Create a new inbound IP pool and a VLAN group without uplink.
  3. Assign the VLAN and inbound IP pool to the templates or service profiles.

Refer to Setting the Management IP Address in the Cisco UCS Manager manual for details.

By the way, you may see Vlan ‘xxx’ resolved to unsupported VLAN ID in Cisco UCS Manager when you clean up the existing IP pools and create a new inbound pool.

Highlight Scripts in Microsoft OneNote 2016

I usually document my scripts in OneNote, so it would be perfect if OneNote 2016 could highlight scripts. I found a nice plugin called “NoteHighlight2016” for OneNote 2016. It supports both 32-bit and 64-bit installations. You can download it from GitHub.

The default languages are C#, SQL, CSS, JS, HTML, XML, JAVA, PHP, Perl, Python, Ruby, and CPP. You can show more or fewer of them by changing the settings in ribbon.xml in the installation folder.

“x/xx on FI-A is connected by a unknown server device” on Cisco UCS

You may see the following message in the ‘info’ category of faults in Cisco UCS Manager after upgrading the infrastructure firmware to 3.2.x.

“x/xx on FI-A is connected by a unknown server device”

This is a bug documented in CSCvk76095. You have to reset the port on the FI to fix it.

  1. Go to “Equipment” in Cisco UCS Manager.
  2. Go to “Fabric Interconnects” -> Go to the corresponding FI.
  3. Right-click the port x/xx -> Choose “Disable“.
  4. You will see multiple major faults. Wait for 5 seconds.
  5. Right-click the port x/xx -> Choose “Enable“.
  6. All warnings disappear after about 5 minutes. You may still see the warning in the GUI due to caching; log in again and check.

This change impacts one link between the IOM and the FI port. You need downtime if the IOM has only a single path. I didn’t see any impact on the ESXi blades in the pod.

Connect to a Newly Provisioned Raspberry Pi for Less than $3

Configuring the IP address of a newly provisioned Raspberry Pi troubled me for a long time. I needed to connect a monitor so I could log in to the system and configure the IP address. The problem was that I didn’t have a monitor; I only had a laptop.

Last year my old laptop died. I connected its display panel to my Raspberry Pi through an HDMI board. It was not a low-cost solution: it cost me more than $10, and the monitor, cables and board looked ugly.

Actually there is another solution that leverages the laptop’s keyboard and monitor: a serial console, similar to how you configure Cisco network switches. The following is how to do it. I did this on a Raspberry Pi 2.

  1. You need to buy a USB to TTL adapter with the CP2102 chipset.
  2. Connect the pins to the Raspberry Pi 2. Refer here for GPIO layout.
    TXD > Pi RXD Pin #10 (GPIO 15)
    RXD > Pi TXD Pin #08 (GPIO 14)
    GND > Pi GND Pin #6
  3. Connect the USB adapter to the laptop. You will see a device in ‘Device Manager’ that needs drivers.
  4. Download and install the driver.
  5. Download and install PuTTY.
  6. Open PuTTY and select “Serial”.
  7. “Serial line” is COM3 or COM4.
  8. “Speed” is 115200.

I bought the USB to TTL adapter on Taobao (a Chinese counterpart of AliExpress). It cost around $1.2 including shipping.

Show CDP Neighbor of Cisco UCS Uplinks

There are two ways to know which network switch ports the network uplinks of Cisco UCS Fabric Interconnects are connected to.

By CLI

  • SSH to the Cisco UCS Manager.
  • Connect to FI-A.
# connect nxos a
  • Show neighbor of network uplinks.
# show cdp neighbor interface ethernet <port num>

By PowerShell

  • Make sure Cisco PowerTool (For UCS Manager) is installed.
  • Enable the Information Policy in the UCSM GUI.
    • Go to “Equipment” -> “Policies” tab -> “Global Policies” tab -> “Info Policy” area.
    • Change to “Enabled“. (No impact to running blades)
  • Open a PowerShell window.
  • Connect to the UCS Manager.
# Connect-Ucs <UCS FQDN>
  • Show CDP neighbor details.
# Get-UcsNetworkLanNeighborEntry

Side notes

The following command shows the network switch name, network switch ports and FI ports:

# Get-UcsNetworkLanNeighborEntry | Select deviceid,remoteinterface,localinterface

If you prefer to enable the “Info Policy” with PowerShell, run the following command:

# Get-UcsTopInfoPolicy | Set-UcsTopInfoPolicy -State enabled -Force

“default Keyring’s certificate is invalid” in Cisco UCS Manager

You may see the following error in Cisco UCS Manager:

default Keyring’s certificate is invalid

The reason is that the default KeyRing under Admin -> Key Management has expired. It’s not possible to delete or change this KeyRing in the GUI. You have to log in to Cisco UCS Manager over SSH and run the following commands (the strings after “#”):

lab-B# scope security
lab-B /security # scope keyring default
lab-B /security/keyring # set regenerate yes
lab-B /security/keyring* # commit-buffer
lab-B /security/keyring #

This disconnects the Cisco UCS Manager GUI on your client computer. Just refresh the page after about 5 seconds. There is no impact on the blades.

A Huge Amount of Warnings of “Image is Deleted” in Cisco UCS Manager

A few days ago, I deleted some older firmware packages in Cisco UCS Manager. Suddenly more than 100 warnings were generated. The messages were similar to the one below:

blade-controller image with vendor Cisco System Inc……is deleted

Cause: image-deleted

Clearly, they were triggered by the package deletion. But all of my service profiles and service profile templates were using firmware packages that still existed. The deleted packages were not used anywhere.

I also deleted the download tasks and cleaned up everything I could, but the warnings persisted. After reading a blog article, I figured out that they were caused by the default firmware policy.

In case you are facing the same issue, go to Servers -> Policies -> Host Firmware Packages -> default, click Modify Package Versions, and change it to an available version.

 

Install LXC on CentOS 7 Minimal Version

Some notes on LXC. CentOS 7 minimal doesn’t support installing LXC by default since LXC is deprecated in version 7. The new container solution is based on the Docker framework.

There is an alternative way to install LXC. The procedure is as follows:

  1. Install the EPEL (Extra Packages for Enterprise Linux) repository.
    # yum install epel-release
  2. Install some dependencies.
    # yum install perl debootstrap libvirt
  3. Now you can install LXC from the EPEL repository.
    # yum install lxc lxc-templates

Cannot Open KVM Virtual Machine Manager on CentOS 7

I got the following error message when I tried to run the KVM Virtual Machine Manager (virt-manager) over SSH.

Gtk-WARNING **: cannot open display:

There are several things to check:

  • Make sure “X11Forwarding” is set to “yes” in /etc/ssh/sshd_config on the machine where you run virt-manager.
    grep "^X11" /etc/ssh/sshd_config
  • If you connect over SSH from Windows, X11 needs to be forwarded to an “X Window server” running on Windows. I use Xming.
  • If you connect with PuTTY on Windows, configure X11 forwarding.
    • Go to “Connection” -> “SSH” -> “X11“.
    • Check “Enable X11 forwarding“.
    • Assign the xming.exe path in “X authority file for local display“.
  • If you use the terminal on macOS, you need to install XQuartz. It configures the terminal automatically.

Now you are ready to use “virt-manager“.
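The first check in the list can also be scripted. The sketch below uses a hypothetical helper, `x11_forwarding_enabled`, which is not part of any tool mentioned here. It scans the text of an sshd_config file and assumes standard OpenSSH behaviour: the first occurrence of a keyword wins and the default is “no”. Match blocks are ignored for simplicity.

```python
def x11_forwarding_enabled(sshd_config_text: str) -> bool:
    """Return True if the first effective X11Forwarding directive is 'yes'."""
    for raw_line in sshd_config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments such as "#X11Forwarding no".
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "x11forwarding":
            # Assumption: sshd uses the first value it finds for a keyword.
            return parts[1].strip().lower() == "yes"
    # Assumption: OpenSSH treats a missing X11Forwarding directive as "no".
    return False


# Sample content only; on a real host read /etc/ssh/sshd_config instead.
sample = """\
#X11Forwarding no
X11Forwarding yes
X11UseLocalhost no
"""
print(x11_forwarding_enabled(sample))
```

On a real machine you would pass the contents of /etc/ssh/sshd_config and restart sshd after changing the setting.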

“Timed out waiting for the PowerShell extension to start” in Visual Studio Code

When you load a PowerShell script, you may see the following error message:

Timed out waiting for the PowerShell extension to start

If you check the error logs, the following appears:

The language service could not be started

One possible reason is that your PowerShell execution policy is set to “AllSigned“. You can check the policy by running the PowerShell command below.

Get-ExecutionPolicy

Run the following command in an elevated PowerShell window to change the policy.

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned

 

Cisco UCS Blade Cannot Get IP Address for KVM

You may see “The IP address to reach the server is not set” when clicking the KVM console in Cisco UCS Manager. The issue persists even when Cisco UCS Manager has enough management IP addresses. Re-acknowledging the server or resetting the CIMC does not fix the problem.

The fix is to go to “Equipment” -> select the server -> “General” tab -> “Server Maintenance” and “Decommission” the server.

Wait for the decommission to complete, then re-acknowledge the server. An IP address is assigned to the server after the acknowledgement completes.

How to Specify Allowed IP Addresses in ESXi Firewall by PowerCLI

During a recent review of my lab environment, I noticed that my lab ESXi hosts allow connections from all IP addresses for the NTP service. This is not best practice for a solid environment. I want to allow only certain IP addresses in case of vulnerabilities in the NTP service. There are a lot of blogs about how to enable or disable a firewall ruleset, but none about how to restrict the allowed IP addresses. The following is what I figured out. Please let me know if you see anything I can improve.

# Connect to vCenter Server with Connect-VIServer before using this script.
$vmhosts = Get-VMHost -Location esxiCluster
foreach ($vmhost in $vmhosts) {
    $esxcli = Get-EsxCli -VMHost $vmhost -V2

    # Disable "Allow connections from any IP address" on the NTP client ruleset.
    $ntpRuleSet = $esxcli.network.firewall.ruleset.set.CreateArgs()
    $ntpRuleSet.allowedall = $false
    $ntpRuleSet.rulesetid = "ntpClient"
    $esxcli.network.firewall.ruleset.set.Invoke($ntpRuleSet)

    # Add the allowed IP addresses one at a time.
    $ntpAllowIP = $esxcli.network.firewall.ruleset.allowedip.add.CreateArgs()
    $ntpAllowIP.rulesetid = "ntpClient"
    $ntpAllowIP.ipaddress = "192.168.0.1"
    $esxcli.network.firewall.ruleset.allowedip.add.Invoke($ntpAllowIP)
    $ntpAllowIP.ipaddress = "192.168.0.2"   # second address; replace with your own
    $esxcli.network.firewall.ruleset.allowedip.add.Invoke($ntpAllowIP)
}

The location (esxiCluster) and the IP addresses are custom parameters. Change them accordingly.

The script gets all ESXi hosts in the specified location, which can be a cluster name, an ESXi host name, or a folder. Then it disables the “Allow connections from any IP address” option of the ruleset and adds 2 IP addresses to the ruleset.

ESXi Disconnects From vCenter

If you are still using Windows 2008 for vCenter Server, you may see ESXi hosts repeatedly lose their connection to vCenter Server after recent Windows patching. It’s not like a heartbeat lost for a few seconds; an ESXi host can take minutes to come back online.

You can see logs similar to the following in vpxd.log:

2018-08-03T09:24:23.337-04:00 error vpxd[20160] [Originator@6876 sub=HttpConnectionPool-000000] [ConnectComplete] Connect failed to <cs p:00000000200ed300, TCP:XXXXXXXXXXXXXXXX:443>; cnx: (null), error: class Vmacore::SystemException(Only one usage of each socket address (protocol/network address/port) is normally permitted)

2018-08-03T09:24:23.337-04:00 error vpxd[06332] [Originator@6876 sub=Vmomi opID=HB-host-28@307067-1d257f9c] [VpxdClientAdapter] Got vmacore exception: Only one usage of each socket address (protocol/network address/port) is normally permitted

 

2018-08-03T09:24:23.338-04:00 error vpxd[06332] [Originator@6876 sub=Vmomi opID=HB-host-28@307067-1d257f9c] [VpxdClientAdapter] Backtrace:

–>

–> [backtrace begin] product: VMware VirtualCenter, version: 6.0.0, build: build-3634793, tag: vpxd

–> backtrace[00] vmacore.dll[0x001C599A]

–> backtrace[01] vmacore.dll[0x0005C8BF]

–> backtrace[02] vmacore.dll[0x0005DA0E]

That’s because one of the following patches was installed on your Windows server.

July 10, 2018—KB4338818 (Monthly Rollup)

July 10, 2018—KB4338823 (Security-only update)

The fixes are the following.

If you installed KB4338818, please install July 18, 2018—KB4338821 (Preview of Monthly Rollup)

If you installed KB4338823, please install Improvements and fixes – Windows 7 Service Pack 1 and Windows Server 2008 R2 Service Pack 1 (KB4345459)

Basic Concepts: Linux Disks

Disk Interface

  • IDE (ATA): Bandwidth is 133 MB/s. IOPS is ~100. The interface can connect a maximum of 2 disks.
  • SCSI: IOPS is ~150. It can connect 8 or 16 disks.
    • Ultra320 SCSI – 320 MB/s
    • Ultra640 SCSI – 640 MB/s
  • SATA: Bandwidth is 6 Gbps. IOPS is ~150. It can connect to 8 or 16 disks.
  • SAS: Bandwidth is 6 Gbps. IOPS is ~200. It can connect to 8 or 16 disks.
  • USB: Bandwidth is 480 Mbps (USB 2.0). IOPS varies.
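Interface speeds are quoted in a mix of bits and bytes per second, which makes them easy to misread. The sketch below does the arithmetic under two stated assumptions: the USB figure refers to the USB 2.0 line rate of 480 Mbit/s, and a 6 Gbit/s SATA or SAS link uses 8b/10b encoding, so 10 line bits carry 1 data byte. The function names are made up for illustration.

```python
def mbit_to_mbyte_per_s(megabits_per_second: float) -> float:
    """Convert a rate in Mbit/s to MB/s (8 bits per byte, no protocol overhead)."""
    return megabits_per_second / 8


def payload_mbyte_per_s_8b10b(line_rate_gbit: float) -> float:
    """Usable MB/s on a link with 8b/10b encoding (10 line bits per data byte)."""
    return line_rate_gbit * 1000 / 10


usb2 = mbit_to_mbyte_per_s(480)        # assumed USB 2.0 line rate
sata3 = payload_mbyte_per_s_8b10b(6)   # 6 Gbit/s SATA/SAS link
print(f"USB 2.0: {usb2} MB/s, SATA 6 Gbit/s: {sata3} MB/s")
```

These are upper bounds from the line rate; real throughput is lower because of protocol overhead and the disk itself.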

Linux Disks

Disk (Device) Types

Block: Can be accessed randomly. Unit is “block”.

Character: Can be accessed sequentially. Unit is “character”.

Disk Files (FHS)

Files are under ‘/dev/’. Every disk (device) is a file on Linux.

Device ID:

Major: Primary device ID. It identifies the device type so the proper driver is used.

Minor: Secondary device ID. It’s the entry for a specific device among devices of the same type.

Create new device:

# mknod

[root@centos] mknod /dev/usbtest b 100 231 
[root@centos] ll /dev | grep usbtest 
brw-r--r--. 1 root root 100, 231 Jul 1 11:02 usbtest 
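The major and minor numbers can also be inspected from Python’s standard os module. A minimal sketch, reusing the 100 and 231 numbers from the example above (they are arbitrary test values, not a real driver):

```python
import os

# Combine the example major/minor numbers into one device number and split it again.
dev = os.makedev(100, 231)
print(os.major(dev), os.minor(dev))

# Read the numbers of an existing device node. st_rdev holds the device ID
# of the node itself; /dev/null is used only because it exists on any Linux host.
st = os.stat("/dev/null")
print(os.major(st.st_rdev), os.minor(st.st_rdev))
```

The same st_rdev lookup works for a disk node such as one created with mknod.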

Disk Name

Device names follow the registry maintained by LANANA (the Linux Assigned Names and Numbers Authority).

IDE: /dev/hd[a-z]

SCSI/SATA/USB/SAS: /dev/sd[a-z][0-9]

There are 3 ways to identify a disk, listed below. You can also find them under /dev/disk/by-*.

  • Device file name
  • Volume labels
  • UUIDs

Disk Partition

MBR: Master Boot Record. It starts at sector 0. Size is 512 bytes. It consists of 3 parts:

  • Bootloader and boot applications. Size is 446 bytes.
  • Partition table. Maximum 4 entries, 16 bytes each. Size is 64 bytes.
  • MBR validity mark. ’55AA’ means the MBR is valid. Size is 2 bytes.

The maximum number of ‘Primary’ partitions is 4. If you need more than 4 partitions, create an ‘Extended’ partition as the last of the 4, then create ‘Logical’ partitions inside the ‘Extended’ partition. The 4 partitions are numbered 1 to 4. ‘Logical’ partitions always start from 5.
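The 446 + 64 + 2 byte layout can be checked with a few lines of Python. This is only a sketch: `parse_mbr` is a hypothetical helper, and the sector it parses is built in memory (one bootable entry of type 0x83 starting at sector 2048) rather than read from a real disk.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446
ENTRY_SIZE = 16


def parse_mbr(sector: bytes):
    """Split a 512-byte MBR into boot code, partition entries and signature check."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    boot_code = sector[:BOOT_CODE_SIZE]
    entries = []
    for i in range(4):
        start = BOOT_CODE_SIZE + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # Entry layout: status, CHS start (3 bytes, skipped), type,
        # CHS end (3 bytes, skipped), first LBA sector, sector count.
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", raw)
        if ptype != 0:  # type 0 marks an unused slot
            entries.append({"slot": i + 1, "bootable": status == 0x80,
                            "type": ptype, "first_lba": first_lba,
                            "sectors": count})
    signature_ok = sector[510:512] == b"\x55\xaa"
    return boot_code, entries, signature_ok


# Build a synthetic MBR for the demonstration.
sector = bytearray(SECTOR_SIZE)
sector[446:462] = struct.pack("<B3xB3xII", 0x80, 0x83, 2048, 204800)
sector[510:512] = b"\x55\xaa"

boot_code, entries, ok = parse_mbr(bytes(sector))
print(len(boot_code), entries, ok)
```

To look at a real disk you would read its first 512 bytes (root privileges required) and pass them to the same function.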

UI Hang on “Loading” on vRealize Operations Manager (vROPs)

Sometimes the vROPs web UI hangs, stays on “Loading”, or does not respond to any click. This happens when you are using vRealize Operations Manager 6.6 or earlier and your computer has a touch screen.

There are two ways to fix that:

  1. Disable the touch screen hardware in Device Manager. The device name is usually “HID-compliant touch screen” under “Human Interface Devices”.
  2. If you are using Chrome, follow the steps below:
    1. Open Chrome and browse to chrome://flags.
    2. Find “Touch Events API” and disable it.

Regular Expression

A regular expression is also called a regexp. There are two classes of regular expressions: Basic Regexp (BRE) and Extended Regexp (ERE). A regexp should be put in double quotation marks.

It can be categorized into:

Strings

. – Matches any single character.

Sample: #cat /etc/fstab | grep "U..D"

It returns the lines that contain U and D with any 2 characters between them.

[] – Any single character from the specified set.

Sample: #cat /etc/fstab | grep "U[UAB]ID"

It returns the lines that contain U, followed by U, A or B, followed by ID.

[^] – Any single character except the specified ones.

Sample: #cat /etc/fstab | grep "[^AB]ID"

It returns the lines that contain ID preceded by a character other than A or B.
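The three patterns above behave the same way in Python’s re module, which makes them easy to try without touching /etc/fstab. A small sketch with made-up sample lines (the fstab-like content is invented for the demonstration):

```python
import re

# Invented sample lines; the first mimics a typical fstab entry.
lines = [
    "UUID=0a1b2c3d / xfs defaults 0 0",
    "/dev/sda2 swap swap defaults 0 0",
    "AID=1",
]


def matching_lines(pattern: str, text_lines):
    """Return the lines in which the pattern matches anywhere, like grep does."""
    return [line for line in text_lines if re.search(pattern, line)]


for pattern in ("U..D", "U[UAB]ID", "[^AB]ID"):
    print(pattern, matching_lines(pattern, lines))
```

Only the UUID line matches each pattern. “AID=1” is rejected by “[^AB]ID” because the character in front of ID is an A.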

[[:digit:]] – All digit characters.

[[:lower:]] – All lower case characters.

[[:upper:]] – All upper case characters.

[[:alpha:]] – All alphabet characters.

[[:alnum:]] – All alphanumeric characters.

[[:space:]] – All whitespace characters.

[[:print:]] – All visible characters and space.

[[:blank:]] – Space and tab.

[[:punct:]] – Punctuation and symbols.

Counting

 

Location

 

Grouping

 

 

Error 0x800f081f When Enabling .NET 3.5 on Windows 10

When you install vSphere Client 5.x on a Windows 10 computer, you may see an “Enable .net 3.5 failed” message. When you then try to enable .NET 3.5 on Windows 10 manually, it shows error code 0x800f081f.

This issue occurs on computers where internet access is blocked or restricted by policy. The only way to avoid it is to use the command line to specify a local .NET source path and force the installation.

  1. Mount the Windows 10 ISO to your computer as a new drive.
  2. Copy the path of “xxx:\sources\sxs“.
  3. Run the following command.
    dism /online /enable-feature /featurename:netfx3 /all /limitaccess /source:xxx:\sources\sxs

UCS Manager UI Fonts Size on 4K Screen

Older versions of UCS Manager use a Java application. The UI fonts can be extremely small on a high-DPI screen. The fix is:

  1. Go to “C:\Program Files (x86)\Java\jre1.8.0_171\bin“.
  2. Go to “Properties” of “jp2launcher.exe“.
  3. “Compatibility” tab -> “Change high DPI settings“.
  4. Check “Override high DPI scaling behavior….“.
  5. Select “System (Enhanced)” or “System“.

 

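Deploying vCenter Server VCSA on VMware Workstation

There are many articles online about deploying vCenter Server VCSA on VMware Workstation, but when I followed them I always ran into all kinds of problems during deployment. Below are the key points I summarized, for reference only.

I assume your lab has no DNS or domain server and simply uses the DHCP service of VMware Workstation, with the VM NIC set to “host-only”. The steps below are only meant for quick tests.

  1. vCenter Server checks the FQDN on its first boot after installation. Without a DNS server the FQDN check fails, so make sure “Host Network Identity” is set to the IP address when you install vCenter Server.
  2. The VM powers on automatically right after the OVA file is imported, and its NIC is sometimes disconnected. Make sure the NIC is connected.
  3. The first boot takes about 15 to 20 minutes. The VM console does not show the IP address until the boot has fully completed. Another sign that vCenter Server is ready is that its IP address responds to ping.
  4. After the first boot, open https://vcenter_ip:5480 to continue the vCenter Server configuration.
  5. The password of Administrator@vsphere.local is the one you entered in the OVA import dialog.

Update, 28 May 2018:

In step 4 above you may be unable to log in as root and get an authentication failure. This is caused by the root account being locked. Unlock it as follows:

  1. Reboot the vCenter Server VM.
  2. Press “e” on the Photon boot screen.
  3. Append “rw init=/bin/bash” to the end of the 2nd line. Refer here for details.
  4. When you see the # prompt, run “passwd” to change the root password.
  5. Run “pam_tally2 --user root” to check how many times the root password was entered incorrectly.
  6. If the failure count is greater than 1, run “pam_tally2 --user root --reset” to unlock the root account.
  7. Reboot the VM. You should now be able to log in.

Update, 31 May 2018:

After logging in at step 4 above you should see the vCenter Server installation wizard. If you want vCenter Server to use only an IP address, make sure the “System name” field is set to the IP address.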

Deploy vCenter Server Virtual Appliance on VMware Workstation

There are a lot of articles on how to deploy the vCenter Server virtual appliance on VMware Workstation. I followed them, but the deployment somehow failed. Following are some notes for your reference if you want to deploy the vCenter Server virtual appliance on VMware Workstation quickly.

I assume you don’t have DNS or domain servers, the native DHCP service of VMware Workstation is used, you just want vCenter Server for some quick testing, and you select a “host-only” NIC.

  1. The vCenter Server installer validates the FQDN on first boot. The process fails if the FQDN doesn’t resolve, so make sure “Host Network Identity” is the IP address of the VM when you set the OVA options.
  2. The VM boots immediately after the OVA file is imported, but the VM NIC is sometimes in “disconnected” status. You have to connect the NIC in the VM properties quickly.
  3. You have to wait about 15 to 20 minutes after the first boot. The console screen doesn’t show the IP address before the VM is fully ready. The indicator of readiness is that the IP address of the VM responds to ping.
  4. Log in to https://vcenter_ip:5480 to continue the vCenter Server installation once the first boot is complete.
  5. The password of Administrator@vsphere.local is the same as the one you set while importing the OVA.

Updates 28th May 2018:

Root authentication in step 4 above may fail. It’s caused by the root account being locked. Follow the procedure below:

  1. Reboot the vCenter VM.
  2. Press “e” when you see the Photon boot screen.
  3. Add “rw init=/bin/bash” to the end of the 2nd line. Refer here for detail.
  4. Run “passwd” to change the root password when you see the # prompt.
  5. Run “pam_tally2 --user root” to check how many failed logins root has.
  6. Run “pam_tally2 --user root --reset” to unlock root if you see more than 1 in step 5.
  7. Reboot. You should be able to log in as root now.

Updates 31st May 2018:

You should see the installation wizard in step 4. Make sure the “System name” field is the IP address if you only want to use an IP for vCenter Server.

Updates 5th Sep 2018:

You may see the following error during installation.

Could not connect to VMware Directory Service via LDAP

It indicates that the vCenter Server FQDN doesn’t resolve. In a home lab, you may want to add the DNS entries to the hosts file.

Troubleshooting Network Performance of Virtual Machine

There are several layers of networking in a virtualization infrastructure: guest operating system, virtual machine, ESXi driver, physical network adapters, RJ45/SFP, network switches, etc. Sometimes it’s hard to say exactly where a problem is, especially with hardware-layer problems. Today I worked on a very interesting case that may give some ideas for troubleshooting network performance issues caused by the hardware layers.

A user told me he was bothered by the network performance of a virtual machine. Copying data to an NFS share was slow, but responses to the “ping” command looked good. I didn’t see any issue on the virtual machine layer. VMware Tools was up to date, the Windows OS was patched, the virtual network adapter type was VMXNET3 and the VM version was also up to date.

When I tried to copy an image file to a shared folder on the virtual machine, the speed was sometimes fast and sometimes not. Since the host has two physical uplinks, that led me to guess the problem could be one of the uplinks.

After a lot of swapping and cable changing, we eventually figured out there was a bad SFP on the network switch end. I was able to observe the issue with “psping.exe” from Microsoft Sysinternals. I used the following command to send pings of different sizes to the virtual machine. Packet drops increased as I increased the packet size.

psping.exe -l <packet size> <Destination>
Example: psping.exe -l 4k xxxx.contoso.com

The size can be 1k, 2m or even larger. I think this is a good way to identify problems outside of ESXi, especially SFP problems, as this kind of problem didn’t produce any CRC or error counts on the network switch.

You can also use the native Windows command “ping.exe” as follows. The size unit is bytes. For example, enter 4096 if you want to send 4 KB.

ping.exe -l <size> <Destination>
Example: ping.exe -l 4096 xxx.contoso.com
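When stepping through several sizes, it helps to translate the psping-style sizes into the byte counts that ping.exe expects. The sketch below is only an illustration: `size_to_bytes` and `ping_command` are made-up helpers, and it assumes k means 1024 bytes and m means 1024 × 1024 bytes, in line with the 4k = 4096 example above.

```python
def size_to_bytes(size: str) -> int:
    """Convert a size such as '4k' or '2m' to bytes (assumes k = 1024, m = 1024 * 1024)."""
    units = {"k": 1024, "m": 1024 * 1024}
    text = size.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def ping_command(size: str, destination: str) -> str:
    """Build the ping.exe command line for a psping-style size."""
    return f"ping.exe -l {size_to_bytes(size)} {destination}"


# Print the commands for a few increasing sizes.
for size in ("1k", "4k", "16k"):
    print(ping_command(size, "xxx.contoso.com"))
```

Running the printed commands one after another and comparing the loss rate per size mirrors the test described above.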

 

 

IE 11 Window Doesn’t Change Between 4K Internal and Regular External Monitors

Just a quick note. If you use multiple monitors, some 4K and some regular resolution, you may see window display issues when you move Internet Explorer between them. Follow the KB below to change the registry so that Internet Explorer 11 accommodates the monitor resolutions.

Internet Explorer 11 window display changes between a built-in device monitor and an external monitor

The older version of cis-upgrade-runner cannot be removed when upgrading vCenter Server 6.0

When you upgrade or patch vCenter Server 6.0 for Windows, you may see the following symptoms:

“The older version of cis-upgrade-runner cannot be removed. Contact your technical support group.”

Or error code 1063:

“Installation of component VMware CIS upgrade runner failed with error code ‘1063’”

That means the vCenter Server installer cannot find the MSI files of the existing vCenter Server services. Possible reasons are:

  • You deleted the MSI files in the “Temp” folder of the profile you used to install vCenter Server.
  • The account you used to log in and install vCenter Server had a roaming profile. The profile’s “Temp” folder was automatically deleted when you rebooted or logged off the server.

vCenter Server 6.0 for Windows consists of a lot of standalone packages. The upgrade process usually uninstalls the old packages and then installs the newer ones, so the failure doesn’t impact the database or inventory data. You can start the upgrade again.

But you cannot manually uninstall the old packages, since the upgrade process brings down the vCenter services first and then uninstalls the old packages. If you have already uninstalled old packages, the upgrade will be stuck at the stage of bringing down the vCenter services, since some services may already have been removed. For example, “vmware-python” maps to “VMware vCenter Configuration Service”. If you manually uninstall it before launching the upgrade, the service is removed and the upgrade is not able to check its status.

The easiest way to get rid of this problem is:

  1. Open Registry Editor (regedit) and go to the path “HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Installer\Products”.
    You will see a lot of keys there.
  2. Search for the keyword “vmware-“. These keys store the package info of vCenter Server.
  3. Expand one of the keys. Go to “SourceList”.
  4. The value of “LastUsedSource” is the path of the MSI files of the old vCenter Server installer.
    For example my value is “m;1;X:\vcenter-server\packages\”.
  5. Make sure your server has the path mentioned in the previous step (in my case X:\vcenter-server\packages\) and the old MSI files are available in that path. If it’s a CD-ROM drive letter, you just need to mount the old vCenter Server image to the drive.
  6. Copy the new vCenter Server image to a local folder, uncompress it and launch the installer locally.
  7. Now the upgrade process can read the original packages from the path found in step 4. It will automatically remove the old packages using the old MSI files.

There are two other workarounds. One is to modify the value of “LastUsedSource” to reflect a new location of the packages, but you still need the old MSI files to be there. The other is to delete the key after you find it in step 2. (I never tested this way, but it should work, as it lets the vCenter Server installer think the server is brand new so the installer can overwrite the existing folders.)
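The “LastUsedSource” value from step 4 is three fields separated by semicolons. The sketch below splits it so the path can be checked before the upgrade is launched. `parse_last_used_source` is a hypothetical helper, and reading the fields as source type, index and path is an assumption based on the example value above.

```python
def parse_last_used_source(value: str) -> dict:
    """Split a LastUsedSource value into its three semicolon-separated fields."""
    # Split at most twice so semicolons inside the path would be preserved.
    source_type, index, path = value.split(";", 2)
    return {"type": source_type, "index": int(index), "path": path}


# The example value from step 4 (backslashes escaped for Python).
example = "m;1;X:\\vcenter-server\\packages\\"
info = parse_last_used_source(example)
print(info["path"])
```

On the server itself, the value could be read from the registry first and the resulting path checked for the old MSI files.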

I also wrote another article for upgrading error on vCenter Server 5.5 for your reference:

CustomAction VM_InstallJRE returned actual error code 1624

 

CVE-2017-5754, CVE-2017-5753 and CVE-2017-5715 (Spectre and Meltdown)

You may know that 3 vulnerabilities were recently disclosed. Long story short, kernel address space can be exposed to attackers while the processor runs user-space code. This impacts not only Intel processors but also AMD and ARM. CVE-2017-5715 is a hardware issue that can only be fixed by applying certain firmware. CVE-2017-5754 and CVE-2017-5753 require OS patches that change how code accesses kernel address space. Following are some useful links for your reference.

CVE-2017-5753

CVE-2017-5715

CVE-2017-5754

VMware: https://www.vmware.com/security/advisories/VMSA-2018-0002.html (For CVE-2017-5753 and CVE-2017-5715. VMware has not published anything for CVE-2017-5754 yet.)

Microsoft: https://support.microsoft.com/en-gb/help/4072698/windows-server-guidance-to-protect-against-the-speculative-execution
https://support.microsoft.com/en-gb/help/4073119/protect-against-speculative-execution-side-channel-vulnerabilities-in

HPE: http://h22208.www2.hpe.com/eginfolib/securityalerts/SCAM/Side_Channel_Analysis_Method.html

Cisco: https://tools.cisco.com/security/center/content/CiscoSecurityAdvisory/cisco-sa-20180104-cpusidechannel

VMware Remote Console Freeze or Black Screen

The latest version of VMware Remote Console is 10.0.2. Some functions appear to have changed in this release. You may see the following symptoms after upgrading to it.

  1. The virtual machine console screen is black.
  2. The console screen works properly as long as you only click buttons in the VMware Remote Console window.
  3. The console screen freezes once VMware Remote Console grabs the mouse or keyboard.

The cause is that your anti-virus software blocks the internal functions VMware Remote Console uses to grab the mouse and keyboard. Try disabling the anti-virus software temporarily.

How To Migrate Parent Disk on Hyper-V 2012

If you use Microsoft Hyper-V 2012 with “Differencing Disks”, you may run into trouble when you want to move whole VMs to another location, because migrating the “Parent Disk” is not so easy. Following are the steps to move a parent disk on a Hyper-V server.

Preparation

I assume you want to move a bunch of virtual machines. First of all, you need the disk list of the virtual machines. Following is a script that lists all parent and differencing disks on a Hyper-V server.

$VMs = Get-VM 
Foreach ($VM in $VMs)
{
  $VHDs = Get-VHD -Path $VM.harddrives.path
  foreach ($VHD in $VHDs)
  {
     [pscustomobject]@{
         Name = $VM.name
         VHDType = $VHD.VhdType
         VHD = $VHD.Path
         ParentVHD = $VHD.ParentPath
     }
  }
}

Save it as “Get-vhdParent.ps1”. Launch PowerShell with administrator rights and run the following command to get the parent disk table.

.\Get-vhdParent.ps1 | format-table -autosize

Now you have disk list in hand.

Move parent disks to new location

Moving a parent disk is simple: just copy it to the new location. I suggest making multiple copies if a large number of virtual machines are linked to one parent disk. The reason is that if a parent disk fails, at least it does not impact all linked virtual machines. You can also distribute the duplicated parent disks across multiple locations to avoid a single-location failure.

Re-configure parent disks for virtual machine

To be safe, I suggest getting the parent disk information again with the following command:

Get-VHD -Path VHDPath

Replace “VHDPath” with the real path of the virtual machine’s differencing disk.

The output shows the linked parent disk. Then run the command below to point the differencing disk at the parent disk in the new location.

Set-VHD -Path VHDPath -ParentPath ParentVHDPath

The command returns nothing if it succeeds.

If you manage Hyper-V virtual machines with System Center Virtual Machine Manager, the new parent disk is reflected after you right-click the virtual machine and choose “Refresh” in the System Center Virtual Machine Manager console.

 

 

 

Cannot Complete File Creation Operation When Storage vMotion

Just a quick note. I saw the following error when doing a storage vMotion.

Cannot Complete File Creation Operation.

When I checked /var/log/hostd.log, I saw the following entries:

2017-11-28T02:51:04.476Z info hostd[76A80B70] [Originator@6876 sub=Vimsvc.TaskManager opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] Task Created : haTask--vim.host.OperationCleanup
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] CopyFromEntry: Hostlog_Dump: Hostlog /vmfs/volumes/598700ee-ec
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] UUID: 28dbb1b5-a9d8-e311-1061-03300000002d
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] MigID: 1511837464286041
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] HLState: none
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] ToFrom: none
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] MigType: invalid
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] OpType: nfc
2017-11-28T02:51:04.476Z info hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] WorldID: 0
2017-11-28T02:51:04.478Z warning hostd[772C2B70] [Originator@6876 sub=Libs opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] Hostlog_Flush: Failed to open hostlog /vmfs/volumes/598700e
2017-11-28T02:51:04.478Z warning hostd[772C2B70] [Originator@6876 sub=Vcsvc.OCM opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] PersistToDisk: failed to persist entry /vmfs/volumes/5
2017-11-28T02:51:04.478Z info hostd[772C2B70] [Originator@6876 sub=Default opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] AdapterServer caught exception: vim.fault.CannotCreateFile
2017-11-28T02:51:04.478Z info hostd[772C2B70] [Originator@6876 sub=Vimsvc.TaskManager opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] Task Completed : haTask--vim.host.OperationClean
2017-11-28T02:51:04.478Z info hostd[772C2B70] [Originator@6876 sub=Solo.Vmomi opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] Activation [N5Vmomi10ActivationE:0x75395c80] : Invoke do
2017-11-28T02:51:04.478Z verbose hostd[772C2B70] [Originator@6876 sub=Solo.Vmomi opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] Arg entry:
--> (vim.host.OperationCleanupManager.OperationEntry) {
--> hlogFile = "/vmfs/volumes/598700ee-ec0f9918-5b56-000000000000/XXX-VM-01/XXX-VM-01-375f29ae.hlog",
--> opId = 1511837464286041,
--> opState = "running",
--> opActivity = "nfc",
--> curHostUuid = "28dbb1b5-a9d8-e311-1061-03300000002d",
--> }
2017-11-28T02:51:04.478Z info hostd[772C2B70] [Originator@6876 sub=Solo.Vmomi opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] Throw vim.fault.CannotCreateFile
2017-11-28T02:51:04.478Z info hostd[772C2B70] [Originator@6876 sub=Solo.Vmomi opID=459515D4-000040D6-2d-cf-d4-7817 user=vpxuser:contoso\testuser] Result:
--> (vim.fault.CannotCreateFile) {
--> faultCause = (vmodl.MethodFault) null,
--> file = "/vmfs/volumes/598700ee-ec0f9918-5b56-000000000000/XXX-VM-01/XXX-VM-01-375f29ae.hlog",
--> msg = ""
--> }

This indicates that a file could not be created during the migration. Checking the VM configuration file (.vmx) further, I noticed that the following parameter existed but the file it refers to did not.

migrate.hostlog = "XXX-VM-01-375f29ae.hlog"

You cannot create the file directly. The workaround is to create a .hlog file with a different name and then rename it to the name given in the .vmx file.
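To find affected VMs before a migration fails, the .vmx check described above can be scripted. The following is a minimal Python sketch, assuming the .vmx file uses the plain key = "value" layout shown above and that the VM folder is readable from wherever the script runs. The function names read_vmx_value and missing_hostlog are made up for illustration and are not part of any VMware tool.

```python
import os


def read_vmx_value(vmx_text, key):
    """Return the value of `key` from .vmx text, or None if the key is absent."""
    for line in vmx_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip().lower() == key.lower():
            return value.strip().strip('"')
    return None


def missing_hostlog(vmx_path):
    """Return the path of the referenced .hlog file if it does not exist, else None."""
    with open(vmx_path, encoding="utf-8", errors="replace") as handle:
        hostlog = read_vmx_value(handle.read(), "migrate.hostlog")
    if hostlog is None:
        return None  # no migrate.hostlog entry, nothing to check
    # Assumption: a relative file name is resolved against the VM folder.
    candidate = os.path.join(os.path.dirname(vmx_path), hostlog)
    return None if os.path.exists(candidate) else candidate
```

A non-None result is the .hlog path that the .vmx file points to but that is missing on disk, which is the situation described in this post.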

By the way, there is a known bug in ESXi 6.0 U1 for a similar issue, but I saw this problem on U2. The KB article title is below for reference.

Storage migration of a virtual machine with a name beginning with core fails with the error: Relocate virtual machine coreXX Cannot complete the operation because the file or folder coreXX-XXXXX.hlog already exists

Virtual Machine Cloning Fails At 33%

I had two servers with exactly the same hardware and the same ESXi version installed. Somehow cloning from other ESXi hosts to one server worked, but cloning to the other always failed at 33%.

It only affected cloning of existing VMs, not newly created virtual machines. I spent a lot of time troubleshooting. We replaced cables, changed switch ports and reinstalled ESXi.

There were no abnormal log entries except the following:

2017-11-15T05:47:36.023Z [FFF001A0 verbose 'NfcManager' opID=F39DF7E6-00002211-da-c-bf] [NfcClient] Closing NFC connection to server

2017-11-15T05:47:36.023Z [FFF001A0 warning 'Libs' opID=F39DF7E6-00002211-da-c-bf] SSL: Unknown SSL Error

2017-11-15T05:47:36.023Z [FFF001A0 info 'Libs' opID=F39DF7E6-00002211-da-c-bf] SSL Error: error:1409E10F:SSL routines:SSL3_WRITE_BYTES:bad length

VMware support eventually pointed me to the following KB articles to work around the problem. It looks like a bug in ESXi 5.5 build 2068190.

Disabling SSL for NFC data traffic in vCenter Server

Cloning or deploying from a template takes longer time after upgrading to VMware vSphere 5.1 Update 2 and 5.5

How To Get Used Space By PowerShell

I searched the internet, but it was hard to find an easy way to get used space on Windows Server.

The following PowerShell command returns the size and used space, in GB, of each logical disk on Windows Server 2012 R2.

Get-WmiObject win32_logicaldisk | select deviceid,@{n="Size";e={[math]::Round(($_.size/1GB),2)}},@{n="Used Space";e={[math]::Round((($_.Size-$_.FreeSpace)/1GB),2)}}
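The arithmetic in the command above is size minus free space, divided by 1GB (which is 2^30 bytes in PowerShell) and rounded to two decimals. As a cross-check, here is the same calculation as a small Python sketch. The name used_space_gb is chosen here for illustration, and shutil.disk_usage stands in for the WMI query, so it only reports the drive of the current directory.

```python
import shutil

GIB = 1024 ** 3  # PowerShell's 1GB constant is 2^30 bytes


def used_space_gb(size_bytes, free_bytes):
    """Used space in GB, rounded to two decimals like the PowerShell expression."""
    return round((size_bytes - free_bytes) / GIB, 2)


if __name__ == "__main__":
    usage = shutil.disk_usage(".")  # total, used and free bytes for the current drive
    print("Size:", round(usage.total / GIB, 2))
    print("Used Space:", used_space_gb(usage.total, usage.free))
```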

Could Not Complete Network Copy For File During VM Cloning

This error may only appear on legacy ESXi hosts. The virtual machine clone task throws the error at 33%.

 Clone virtual machine
Could not complete network copy for file 
/vmfs/volumes/5xxxxxxe-4fb01111-3911-0ccxxxxxac38/TEST/TEST.vmdk
Copying Virtual Machine files

You may see the following entries in vpxa.log on the source ESXi host.

2017-11-02T02:51:20.238Z [FFF43B70 info 'Libs' opID=812BF517-00000667-f0-34-64] SSL: syscall error 32: Broken pipe
2017-11-02T02:51:20.238Z [FFF43B70 warning 'Libs' opID=812BF517-00000667-f0-34-64] [NFC ERROR] NfcNetTcpWrite: bWritten: -1
2017-11-02T02:51:20.238Z [FFF43B70 warning 'Libs' opID=812BF517-00000667-f0-34-64] [NFC ERROR] NfcFile_SendMessage: data send failed:
2017-11-02T02:51:20.238Z [FFF43B70 warning 'Libs' opID=812BF517-00000667-f0-34-64] [NFC ERROR] NFC_NETWORK_ERROR
2017-11-02T02:51:20.239Z [FFF43B70 error 'NfcManager' opID=812BF517-00000667-f0-34-64] [NfcClient] File transfer [/vmfs/volumes/5xxxxxxe-4fb01111-3911-0ccxxxxxac38/TEST/TEST.vmdk -> /vmfs/volumes/5xxxxxxe-4fb01111-3911-0ccxxxxxac38/TEST1/TEST1.vmdk] failed: The operation experienced a network error
2017-11-02T02:51:20.239Z [FFF43B70 verbose 'NfcManager' opID=812BF517-00000667-f0-34-64] [NfcClient] Closing NFC connection to server
2017-11-02T02:51:20.239Z [FFF43B70 warning 'Libs' opID=812BF517-00000667-f0-34-64] SSL: Unknown SSL Error
2017-11-02T02:51:20.239Z [FFF43B70 info 'Libs' opID=812BF517-00000667-f0-34-64] SSL Error: error:1409E10F:SSL routines:SSL3_WRITE_BYTES:bad length
2017-11-02T02:51:20.239Z [FFF43B70 warning 'Libs' opID=812BF517-00000667-f0-34-64] [NFC ERROR] NfcNetTcpWrite: bWritten: -1
2017-11-02T02:51:20.239Z [FFF43B70 warning 'Libs' opID=812BF517-00000667-f0-34-64] [NFC ERROR] NfcSendMessage: send failed: NFC_NETWORK_ERROR
2017-11-02T02:51:20.239Z [FFF43B70 error 'NfcManager' opID=812BF517-00000667-f0-34-64] [NfcWorker] Error encountered while processing copy spec for file [ds:///vmfs/volumes/5xxxxxxe-4fb01111-3911-0ccxxxxxac38/TEST/TEST.vmdk -> ds:///vmfs/volumes/5xxxxxxe-4fb01111-3911-0ccxxxxxac38/TEST1/TEST1.vmdk]:
--> vim.fault.NetworkCopyFault
2017-11-02T02:51:20.239Z [FFF43B70 error 'NfcManager' opID=812BF517-00000667-f0-34-64] [NfcManagerImpl] Copy operation failed with error: vim.fault.NetworkCopyFault

You may see the following entries in vpxa.log on the destination ESXi host.

2017-11-02T02:51:18.289Z [304EEB70 warning 'Libs' opID=task-internal-2164-739c5f01] SSL: Unknown SSL Error
2017-11-02T02:51:18.289Z [304EEB70 info 'Libs' opID=task-internal-2164-739c5f01] SSL Error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
2017-11-02T02:51:18.289Z [304EEB70 warning 'Libs' opID=task-internal-2164-739c5f01] [NFC ERROR] NfcNetTcpRead: bRead: -1
2017-11-02T02:51:18.289Z [304EEB70 warning 'Libs' opID=task-internal-2164-739c5f01] [NFC ERROR] NfcNet_Recv: requested 262144, recevied only 16384 bytes
2017-11-02T02:51:18.289Z [304EEB70 warning 'Libs' opID=task-internal-2164-739c5f01] [NFC ERROR] NfcFile_RecvMessage: data recv failed. retval = 3, expected 262144
2017-11-02T02:51:18.289Z [304EEB70 warning 'Libs' opID=task-internal-2164-739c5f01] [NFC ERROR] NfcFile_ContinueReceive: failed to Recv message
2017-11-02T02:51:18.446Z [304EEB70 warning 'Libs' opID=task-internal-2164-739c5f01] [NFC ERROR] NfcProcessStreamMsg: failed to receive file data
2017-11-02T02:51:18.446Z [304EEB70 warning 'Libs' opID=task-internal-2164-739c5f01] [NFC ERROR] NfcServerLoop: NfcServer_HandleRead returned an error : NFC_NETWORK_ERROR
2017-11-02T02:51:18.446Z [304EEB70 error 'provisioningvpxNfcServer' opID=task-internal-2164-739c5f01] [VPXNFCSERVER] Nfc server failed with return value : NFC_NETWORK_ERROR
2017-11-02T02:51:18.446Z [304EEB70 verbose 'provisioningvpxNfcServer' opID=task-internal-2164-739c5f01] [VPXNFCSERVER] Closing NFC session

This indicates the VM may have been created on an older ESXi host or in VMware Workstation and then imported to the current ESXi host. The solution is to create a new VM on the ESXi host and attach only the virtual disks of the problematic VM.

Disable PXE Boot for Individual vNIC on Virtual Machine

When using Auto Deploy I like to control the PXE boot process. I want only the vNICs on the management network to be able to PXE boot. That’s because the DHCP server may learn an incorrect MAC address for the management network if the ESXi host boots from a non-management NIC.

On physical servers it is easy to disable PXE boot for individual network adapters in the BIOS or server profile. Virtual machines are trickier. Here is how to do it. It is useful for a home lab.

  1. Make sure your ESXi VM uses E1000E vNICs. If the type is vmxnet3, you can only disable PXE boot for all vNICs at once. And nested ESXi doesn’t support E1000 vNICs.
  2. Go to the VM folder and edit the vmx file.
  3. You should see entries similar to the one below. The vNIC names run from ethernet0 to ethernetN and match vmnic0 to vmnicN on ESXi.

    ethernet1.virtualDev = "e1000e"

  4. I have 4 vNICs. I want to keep PXE boot for ethernet0 and ethernet1, so I only disable it on the remaining vNICs. Add the following lines to the vmx file.

    ethernet2.opromsize = "0"
    ethernet3.opromsize = "0"

  5. Save the vmx file and quit.
  6. Power on the VM.
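If you have several nested ESXi VMs to prepare, step 4 can be scripted. This is a minimal Python sketch, assuming the VM is powered off and the .vmx content has already been read into a string. disable_pxe is a hypothetical helper name; the ethernetN.opromsize key is the one from the steps above.

```python
def disable_pxe(vmx_text, nic_indexes):
    """Return .vmx text with ethernetN.opromsize = "0" set for each given vNIC index."""
    wanted = {f"ethernet{i}.opromsize" for i in nic_indexes}
    kept = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip().lower()
        if key not in wanted:  # drop any existing opromsize entry we are about to rewrite
            kept.append(line)
    for i in sorted(nic_indexes):
        kept.append(f'ethernet{i}.opromsize = "0"')
    return "\n".join(kept) + "\n"
```

Calling disable_pxe(text, [2, 3]) gives the same result as the manual edit in step 4. Write the returned text back to the vmx file before powering on the VM.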

Please make sure you power on the virtual machine from vSphere Client or vSphere Web Client. As you may know, the VM console may be opened by VMware Workstation if both Workstation and vSphere Client are installed on your computer. It looks like the parameter sometimes doesn’t work if you power on the VM from VMware Workstation.

VMware KB “Disabling Network boot option from appearing in a virtual machine’s BIOS (1014906)” covers the same thing, but the parameter values it gives look incorrect for ESXi 6.5.

How To Find Non-tagged ESXi Hosts

There are plenty of scripts to find tagged ESXi hosts. But what if you want to find all ESXi hosts that are not tagged? The following is a simple script:

Compare-Object ((Get-VMHost | Get-TagAssignment).Entity | select -uniq) (Get-VMHost)

The output looks similar to the following:

InputObject SideIndicator
----------- -------------
esx1        =>
esx2        =>
esx3        =>

The => indicator means the object appears only in the second collection passed to Compare-Object, here the full Get-VMHost list. So the ESXi hosts listed under InputObject are not tagged.

If nothing is returned, all ESXi hosts are tagged.

Please refer to “Using the Compare-Object Cmdlet” for details.
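To make the SideIndicator logic concrete: Compare-Object reports objects found only in the second (difference) collection with =>, and objects found only in the first (reference) collection with <=. Here is a minimal Python sketch of the same idea using plain host-name strings. compare_hosts and the host names are illustrative only.

```python
def compare_hosts(tagged, all_hosts):
    """Mimic Compare-Object: list (name, indicator) pairs for names present on one side only."""
    reference, difference = set(tagged), set(all_hosts)
    only_difference = [(name, "=>") for name in sorted(difference - reference)]
    only_reference = [(name, "<=") for name in sorted(reference - difference)]
    return only_difference + only_reference


# esx4 is tagged, esx1 to esx3 are not: only the untagged hosts are reported.
for name, side in compare_hosts(["esx4"], ["esx1", "esx2", "esx3", "esx4"]):
    print(name, side)
```

Because every tagged host is also in the full host list, only => rows can show up in this use case.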

Unable to Upgrade to Windows Server 2012 R2

I searched the internet but found no further information about this specific error message.

When you upgrade to Windows Server 2012 R2 or 2016, you may see the following error message:

Windows won’t install unless each of these things is taken care of. Close Windows Setup, take care of each one, and then restart Windows Setup to continue.

Upgrades to this build have been disabled.

The reason is that a hidden parameter in the image disables upgrading. You can only do a clean install from the image, not an upgrade. You have to ask the vendor to provide a proper copy, or buy an official Microsoft image to do the upgrade. I cannot publish the parameter for legal reasons.

Most hardware vendors sell a copy of Windows along with new hardware. This kind of Windows is called the OEM version. There are several different license types of Windows:

OEM SLP – This key comes pre-installed in Windows, when it comes from the Factory. This key is geared to work with the OEM Bios Flag found only on that Manufacturer’s computer hardware. So when Windows was installed using the OEM SLP key (at the factory) Windows looks at the motherboard and sees the proper OEM Bios Flag (for that Manufacturer and that version of Windows) and Self-Activates. (that’s why you did not need to Activate your computer after you brought it home)

OEM COA SLP – This is the product key that you see on the sticker on the side (or bottom) of your computer. It is a valid product key, but should only be used in limited situations. The key must be activated by phone. Usually you don’t have to enter the key during Windows installation since setup checks your hardware to get the key.

OEM COA NSLP – Similar to the OEM COA SLP license. The only difference is that you need to enter the key during Windows installation. Under the EULA, the copy must stay on the same computer permanently.

Retail – Product keys are what the customer gets when he buys a Full Packaged Product (FPP), commonly known as a “boxed copy”, of Windows from a retail merchant or purchases Windows online from the Microsoft Store.

KMS Client and Volume MAK – They are issued by organizations for use on client computers associated in some way with the organization. Volume license keys may not be transferred with the computer if the computer changes ownership. Consult your organization or the Volume Licensing Service Center for help with volume license keys.

Hardware vendors may not allow you to upgrade Windows under certain licensing modes. So they may provide a newer Windows image that requires a re-install on the computer rather than an upgrade.

Please refer to the following links for license key details.

What is the difference between SLP and NSLP versions of Windows 7?

Windows License Types Explained

HPE OEM Microsoft Windows Server FAQ Series- Part 1: Licensing Overview

HPE OEM Microsoft Windows Server FAQ Series- Part 2: OEM Licensing Basics

HPE OEM Microsoft Windows Server FAQ Series- Part 3: Microsoft Certificate of Authenticity (COA)

HPE OEM Microsoft Windows Server FAQ Series- Part 4: Windows Server 2016 Basics

HPE OEM Microsoft Windows Server FAQ Series- Part 5: Core-Based Licensing

HPE OEM Microsoft Windows Server FAQ Series- Part 6: Reseller Option Kit

HPE OEM Microsoft Windows Server FAQ Series- Part 7: Client Access Licenses (CALs)

HPE OEM Microsoft Windows Server FAQ Series- Part 8: OEM License Support

Network Problems of Auto Deployed ESXi Host in LAB

I built a simple Auto Deploy environment with vSphere 6.5 in a nested environment. I created virtual ESXi hosts on a physical ESXi host for testing. The whole configuration went smoothly, and I’m impressed that Auto Deploy can be implemented in a few hours. One thing that bothered me was networking.

Somehow new ESXi hosts could not get IP addresses properly. It was not a single problem. The symptoms were: ESXi hosts could not get an IP address, or Configure Management Network was grayed out on the console, or ESXi hosts got an IP address but did not respond to ping. Here is a quick post of my solutions.

To fix all these problems you need to do the following:

  1. Enable Promiscuous Mode on the vSwitch of the physical ESXi host to which the nested ESXi hosts are attached.
  2. (I did this in the Web Client of vCenter 6.5 U1. The procedure may differ on earlier versions.) Edit the host profile of Auto Deploy -> Networking configuration -> Host port group -> Highlight Management Network -> The option Determine how MAC address for vmknic should be decided -> Choose Use the MAC Address from which the system was PXE booted.

If you skip step 1, your nested ESXi hosts may not be able to get DHCP IP addresses properly, or they may get IP addresses that map to a new MAC address, so network packets cannot be transmitted.

Nested ESXi hosts get a DHCP IP address during PXE boot. They get another IP address when the host profile is applied, as soon as the management network is created. These can be two different IP addresses, and the MAC address of the management network can be a new one that does not match any of the vmnics. That would be hard to trace back on a network switch in a real environment, so I think it’s better to do step 2 as well.

Update 10/25/2017: You should choose “User must explicitly choose the policy option” in step 2 above if you have multiple NICs. The reason is that the DHCP IP address during PXE boot may be picked up by a random NIC. If you choose what I mentioned in step 2, the DHCP server may learn the MAC address of a non-management NIC and associate it with the management IP address. Please refer to this article for more detail.

Maximum Supported Boot Devices in Virtual Machine BIOS

I noticed an interesting limitation on VMware virtual machines. If you configure multiple SCSI controllers and distribute more than 8 virtual disks across them, you may experience random OS boot failures when you power cycle the VM. Only the last 8 disks, those with the higher SCSI IDs, are present in the boot order settings of the BIOS. You cannot choose the disks with lower SCSI IDs.

You need to follow VMware KB “Changing the boot order of a virtual machine using vmx options (2011654)” to force the virtual machine to boot from the proper SCSI node.

Automatic vSphere Capacity Report in PPT

Reporting is important to management. As an IT pro, you may need to run regular reports for management, and some of them are time-consuming to generate. vRealize Operations Manager is one option for creating customized reports. It’s a powerful product for organizing data and creating PDF or CSV files at scheduled intervals. I recommend having a look if you plan to implement a performance, capacity and alarm system for your virtual environment.

What if the budget is constrained? Is there a way to create this kind of report? The answer is “Yes”. I worked out an automatic workflow to create the reports. I will not provide a step-by-step guide in this post since it’s an advanced integration of multiple products and everyone may have a different way to do it. You can even create everything by script if you have strong programming skills. I don’t, so I only look for the easiest way to achieve the goal.

Here is an example scenario: I want to run a monthly report on vSphere CPU and memory counts and present it to management in PowerPoint. I want to show management the historical trend of the CPU and memory data. The traditional way is to collect data in vCenter, then organize it and create charts in PowerPoint slides. So the whole workflow is: vCenter -> PowerPoint

If you want to automate the whole process you need to introduce a few more things: PowerCLI, CSV and Excel. You need to develop a PowerCLI script that grabs CPU and memory data from vCenter Server, then exports the data to a CSV table with the PowerShell command Export-Csv. Then import the table into an Excel file with the Office feature Query Data. It loads the CSV table dynamically, and you can even specify which data is queried by using a filter.
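The CSV hand-off is the piece that makes the chain work, so here is a rough sketch of it in Python rather than PowerCLI. It assumes the CPU and memory totals have already been collected elsewhere. The column names and append_capacity_row are placeholders chosen for the example, not part of the original script.

```python
import csv
import os

FIELDS = ["Month", "CpuCount", "MemoryGB"]  # hypothetical column names


def append_capacity_row(csv_path, month, cpu_count, memory_gb):
    """Append one monthly row, writing the header first if the file is new or empty."""
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"Month": month, "CpuCount": cpu_count, "MemoryGB": memory_gb})
```

Because each run appends one row, the Excel query sees a growing history and the chart picks up the new data point on refresh.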

Once the table is present in Excel, you need to create a chart from it. The tricky part is pasting the chart into a PowerPoint slide. You need to use Paste Special to paste the chart as a Microsoft Excel Chart Object. The pasted chart can then be updated automatically when you open the PowerPoint file.

The last step is to create a scheduled task to run the PowerCLI script. Make sure you read my blog post “Extremely slow when run PowerShell script by scheduled tasks” before creating the task.

You can also configure the Excel file to automatically refresh the table from the CSV file.

Cannot Launch Patch Installer on Windows Server 2016

I was trying to update a Windows Server 2016 machine with a standalone patch file. Somehow nothing happened after I double-clicked the installer file. That’s because Windows Server 2016 blocks execution of the file since it was downloaded from the internet.

The quick fix is to right-click the file – Properties – Check Unblock – Click the OK button.

Furthermore, the file has an ADS (alternate data stream) attached. The ADS marks the file as downloaded from the internet.

You can run the following two PowerShell commands to find the name and value of the ADS.

PS C:\> Get-Item testfile.msu -Stream *
PSPath : Microsoft.PowerShell.Core\FileSystem::C:\Users\wzheng110917a\Dow
 $DATA
PSParentPath : Microsoft.PowerShell.Core\FileSystem::C:\Users\wzheng110917a\Dow
PSChildName : 20171011_KB4038801_Updates.msu::$DATA
PSDrive : C
PSProvider : Microsoft.PowerShell.Core\FileSystem
PSIsContainer : False
FileName : C:\Users\wzheng110917a\Downloads\20171011_KB4038801_Updates.msu
Stream : :$DATA
Length : 1241376269

PSPath : Microsoft.PowerShell.Core\FileSystem::C:\Users\wzheng110917a\Dow
 one.Identifier
PSParentPath : Microsoft.PowerShell.Core\FileSystem::C:\Users\wzheng110917a\Dow
PSChildName : 20171011_KB4038801_Updates.msu:Zone.Identifier
PSDrive : C
PSProvider : Microsoft.PowerShell.Core\FileSystem
PSIsContainer : False
FileName : C:\Users\wzheng110917a\Downloads\20171011_KB4038801_Updates.msu
Stream : Zone.Identifier
Length : 26

 

PS C:\> Get-Content testfile.msu -Stream Zone.Identifier
[ZoneTransfer]
ZoneId=3

You can see the ZoneId is 3. The following table shows which zone each ZoneId value stands for.

0     My Computer 
1     Local Intranet Zone 
2     Trusted sites Zone 
3     Internet Zone 
4     Restricted Sites Zone
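The lookup in the table above can be sketched in a few lines of Python. This is only an illustration: it assumes the stream content has already been read as text (for example, copied from the Get-Content output), and zone_name is a made-up helper name.

```python
import configparser

ZONES = {
    0: "My Computer",
    1: "Local Intranet Zone",
    2: "Trusted sites Zone",
    3: "Internet Zone",
    4: "Restricted Sites Zone",
}


def zone_name(stream_text):
    """Return the zone name for the ZoneId found in Zone.Identifier stream text."""
    # The stream uses INI syntax; interpolation is off so '%' in other keys is harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(stream_text)
    zone_id = parser.getint("ZoneTransfer", "ZoneId")
    return ZONES.get(zone_id, "Unknown")


print(zone_name("[ZoneTransfer]\nZoneId=3\n"))  # prints "Internet Zone"
```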

For more information please read the Microsoft blog post “Alternate Data Streams in NTFS”.

You can use Unblock-File if you want to unblock multiple files.

 

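A Blog Is a Place to Record Thoughts

Technology moves very fast. Ten years ago I was still using Windows Server 2003 and Windows XP. Ten years later, we are already getting a first taste of artificial intelligence. There are so many apps, websites and technologies that help us pick up new knowledge faster, and the pace of life keeps speeding up. Even learning, a basic human skill, has gained a “fragmented time” approach on top of the traditional ways (a new way of learning advocated by “逻辑思维” that was very popular in 2017).

With all of today’s technology we can use smartphones to record every little thing in life, even as video. But “thought”, the core of human intelligence, cannot be recorded that way; only writing can reflect the author’s state, mood and memories at the time. In fast-paced city life we sometimes need to slow down and stop, look back at our own past, read what we were thinking then and recall old memories. A saying started circulating online long ago, roughly “don’t walk too fast, wait for your soul to catch up” (possibly a fake celebrity quote). I think writing may be able to do just that.

During this National Day holiday I used my spare time to search the internet for material on setting up an SS server. By chance I came across an article on 逗比根据地 introducing a website that takes historical snapshots of every site on the internet, so I casually searched for my own past and unexpectedly found blog posts I wrote more than ten years ago. I had long since left these articles in some corner of the internet and forgotten them. It looks like I moved them there from MSN Live Spaces. I still vaguely remember how hot blogging was around 2000: internet companies were all pushing their own free blog services, and Microsoft was no exception. Later, apparently because such services made no money and because of regulation, many blog services began to shut down, and again Microsoft was no exception. Fortunately I migrated my articles at the time, which is why I have the chance today to recall the state I was in back then.

I am very happy to be able to reread my inner journey from those years, which takes me back to that era and that state of mind as if I were there again. I will try to spend some time recording my life, to leave some reference for my future self.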

Memory Errors on Modern Servers

I used to see memory degradation on Cisco UCS blades, but seldom on HPE blades. I thought it might be a quality control problem in Cisco’s manufacturing. Today I read two articles on the Cisco website that explain why we see memory degradation and how the error handling works. The articles are listed below.

Managing Correctable Memory Errors on Cisco UCS Servers

UCS Enhanced Memory Error Management

The conclusion in the whitepaper is not specific to Cisco UCS; it applies to any modern server. The following is a summary of why memory error rates are going up nowadays.

  • Larger memory systems contain more bits
  • Higher capacity DRAM chips require smaller bit cells which result in fewer stored charges per bit
  • Lower operating voltages can lead to reduced noise margin
  • Higher operating speeds can lead to reduced timing margin

Oracle Utilizes 50% of Physical Processors on HPE Server

The DBA team told me Oracle was running slowly on an HPE server. I observed that CPU utilization was about 50% of overall capacity. Whenever the Oracle database load went up, the system experienced slowness.

Digging further into the issue, I saw that the Oracle workload only ran on a single physical processor, although the server has two. The Windows 2012 R2 resource manager showed that the system used Processor Groups, and the two physical processors were split into separate groups. This technology is described in a Microsoft MSDN article.

To fix the issue you have to change the value of “NUMA Group Size Optimization” to “Flat” in the BIOS. Please refer to the HPE article for detailed steps.

The HPE server behavior is documented here. Please note, the article says it affects ProLiant Gen9 servers with Intel E5-26xx v3 processors, but it actually also affects Intel E5-26xx v4 processors and Synergy blades.

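Virtual Machine Cannot Obtain a DHCP IP Address

I just solved a problem, so here is a quick update. When a virtual machine cannot obtain a DHCP IP address, the first thing to do is check the firewall, whether it is Windows Firewall or a physical firewall. UDP ports 67 and 68 must not be blocked. Otherwise the virtual machine only gets a 169.x.x.x IP address, which is unusable and means the virtual machine could not get an address from the DHCP server.

DHCP clients use these two ports to obtain IP addresses from the DHCP server. For the technical details, see the RFC document (RFC 2131):

DHCP uses UDP as its transport protocol. DHCP messages from a client
to a server are sent to the ‘DHCP server’ port (67), and DHCP
messages from a server to a client are sent to the ‘DHCP client’ port
(68). A server with multiple network address (e.g., a multi-homed
host) MAY use any of its network addresses in outgoing DHCP messages.

I also used this article during troubleshooting.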

Login To The Query Service Failed When Searching for a Virtual Machine

You may see the following error message when you log in to vCenter Server 6.0 with vSphere Client:

Login to the query service failed.

The server could not interpret the communication from the client. (The remote server returned an error: (500) Internal Server Error.)

This happens because “Use Windows session credentials” was selected when logging in to vSphere Client. Try deselecting it.

Related KB link: Searching the Inventory with the vSphere Client fails (2143566)

Login To The Query Service Failed When Search Virtual Machine

You may see the following problem if you log in to vCenter Server 6.0 with vSphere Client:

Login to the query service failed.

The server could not interpret the communication from the client. (The remote server returned an error: (500) Internal Server Error.)

That’s because the “Use Windows session credentials” checkbox is selected. Deselect it and give it a try.

Refer to KB “Searching the Inventory with the vSphere Client fails (2143566)”.

Virtual Machine Cannot Get DHCP IP Address

Just a quick post. When a virtual machine cannot get a DHCP IP address, the first thing to check is the firewall, whether Windows Firewall or a physical firewall. Make sure UDP ports 67 and 68 are not blocked. Otherwise you will see the virtual machine get a 169.x.x.x IP address only.

These two ports are required for a DHCP client to request an IP address. The mechanism is described in the RFC document (RFC 2131).

DHCP uses UDP as its transport protocol. DHCP messages from a client
to a server are sent to the ‘DHCP server’ port (67), and DHCP
messages from a server to a client are sent to the ‘DHCP client’ port
(68). A server with multiple network address (e.g., a multi-homed
host) MAY use any of its network addresses in outgoing DHCP messages.

I also got some ideas from this post.

Adobe Flash Player Out of Date on vSphere Web Client

You may see ‘Adobe Flash Player Out of Date’ in Chrome when you open vSphere Web Client. If you click the text, Chrome updates Flash Player automatically. But in some cases that doesn’t work, perhaps because your Chrome is controlled by company policy or because of connectivity problems to Adobe.com. I found an article that shows how to fix the issue offline. You can download “Flash Player for Opera and Chromium-based browsers – PPAPI” from the official Adobe KB article.

You may also want to check out my other articles about Flash issues in browsers.

Flash menu appears when right click on vSphere Web Client in Chrome

Cannot open vSphere Web Client on IE11 on Windows 8.1