ESXi Disconnects From vCenter

If you are still using Windows Server 2008 for vCenter Server, you may see ESXi hosts repeatedly lose and regain their connection to vCenter Server after recent Windows patching. This is not a heartbeat lost for a few seconds. A host can take minutes to come back online.

You will see log entries similar to the following in vpxd.log:

2018-08-03T09:24:23.337-04:00 error vpxd[20160] [Originator@6876 sub=HttpConnectionPool-000000] [ConnectComplete] Connect failed to <cs p:00000000200ed300, TCP:XXXXXXXXXXXXXXXX:443>; cnx: (null), error: class Vmacore::SystemException(Only one usage of each socket address (protocol/network address/port) is normally permitted)

2018-08-03T09:24:23.337-04:00 error vpxd[06332] [Originator@6876 sub=Vmomi opID=HB-host-28@307067-1d257f9c] [VpxdClientAdapter] Got vmacore exception: Only one usage of each socket address (protocol/network address/port) is normally permitted

 

2018-08-03T09:24:23.338-04:00 error vpxd[06332] [Originator@6876 sub=Vmomi opID=HB-host-28@307067-1d257f9c] [VpxdClientAdapter] Backtrace:

-->

--> [backtrace begin] product: VMware VirtualCenter, version: 6.0.0, build: build-3634793, tag: vpxd

--> backtrace[00] vmacore.dll[0x001C599A]

--> backtrace[01] vmacore.dll[0x0005C8BF]

--> backtrace[02] vmacore.dll[0x0005DA0E]

This happens because one of the following patches was installed on your Windows server.

July 10, 2018—KB4338818 (Monthly Rollup)

July 10, 2018—KB4338823 (Security-only update)

The fix depends on which patch you installed.

If you installed KB4338818, install KB4338821 (July 18, 2018, Preview of Monthly Rollup).

If you installed KB4338823, install KB4345459 (Improvements and fixes for Windows 7 Service Pack 1 and Windows Server 2008 R2 Service Pack 1).
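
If you want to check quickly whether a vpxd.log contains the error shown above, a small script can do the search. This is only a sketch: the function name is my own, and it assumes each log entry starts with its timestamp, as in the excerpts above.

```python
# Text of the vmacore exception quoted from vpxd.log above.
ERROR_TEXT = "Only one usage of each socket address"


def find_socket_address_errors(lines):
    """Return the timestamps of log lines that contain the socket address error.

    Assumes every log entry begins with its timestamp followed by a space.
    """
    hits = []
    for line in lines:
        if ERROR_TEXT in line:
            timestamp = line.split(" ", 1)[0]
            hits.append(timestamp)
    return hits
```

To use it on a real file, open the log and pass the file object, for example `find_socket_address_errors(open(path, errors="replace"))`, where `path` is wherever your vpxd.log lives.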

Basic Concepts: Linux Disks

Disk Interface

  • IDE (ATA): Bandwidth is 133 MB/s. IOPS is ~100. Each interface can connect a maximum of 2 disks.
  • SCSI: IOPS is ~150. It can connect 8 or 16 disks.
    • Ultra320 SCSI – 320 MB/s
    • Ultra640 SCSI – 640 MB/s
  • SATA: Bandwidth is 6 Gbps. IOPS is ~150. It can connect to 8 or 16 disks.
  • SAS: Bandwidth is 6 Gbps. IOPS is ~200. It can connect to 8 or 16 disks.
  • USB: Bandwidth is 480 Mbps (USB 2.0). IOPS varies.

Linux Disks

Disk (Device) Types

Block: Can be accessed randomly. Unit is “block”.

Character: Can be accessed sequentially. Unit is “character”.

Disk Files (FHS)

Device files are under ‘/dev/’. Every disk (device) is a file on Linux.

Device ID:

Major: Primary device ID. It identifies the device type so that the proper driver is used.

Minor: Secondary device ID. It identifies a specific device among devices of the same type.

Create new device:

# mknod

[root@centos] mknod /dev/usbtest b 100 231
[root@centos] ll /dev | grep usbtest
brw-r--r--. 1 root root 100, 231 Jul 1 11:02 usbtest
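
The same device type and major/minor numbers can be read back from a script. Below is a minimal Python sketch using the standard `os` and `stat` modules. The function names are my own, and it is meant for Linux only.

```python
import os
import stat


def describe_device(st_mode, st_rdev):
    """Return (device type, major, minor) for the given stat fields."""
    if stat.S_ISBLK(st_mode):
        kind = "block"
    elif stat.S_ISCHR(st_mode):
        kind = "character"
    else:
        kind = "not a device"
    # os.major() and os.minor() split the combined device number.
    return kind, os.major(st_rdev), os.minor(st_rdev)


def describe_path(path):
    """Stat a device file such as /dev/usbtest and describe it."""
    info = os.stat(path)
    return describe_device(info.st_mode, info.st_rdev)
```

For the device created above, `describe_path("/dev/usbtest")` should report a block device with major 100 and minor 231.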

Disk Name

Device names follow the Linux device list maintained by LANANA (Linux Assigned Names and Numbers Authority).

IDE: /dev/hd[a-z]

SCSI/SATA/USB/SAS: /dev/sd[a-z] for disks, /dev/sd[a-z][0-9] for partitions

A disk can be identified in the 3 ways below. You can also identify disks by looking under /dev/disk/by-*.

  • Device file name
  • Volume labels
  • UUIDs

Disk Partition

MBR: Master Boot Record. It starts at sector 0. Size is 512 bytes. It consists of 3 parts:

  • Bootloader and boot applications. Size is 446 bytes.
  • Partition table. Maximum 4 entries, 16 bytes each. Size is 64 bytes.
  • MBR validity mark. ‘55AA’ means the MBR is valid. Size is 2 bytes.

A disk can have at most 4 ‘Primary’ partitions. If you need more than 4 partitions, create an ‘Extended’ partition as the last of the 4, then create ‘Logical’ partitions inside the ‘Extended’ partition. The 4 are numbered from 1 to 4. ‘Logical’ partitions always start from 5.
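
To make the 446 + 64 + 2 byte layout concrete, here is a minimal Python sketch that splits a 512-byte sector into those three parts and decodes the 4 partition entries. The function name and the returned field names are my own. The 16-byte entry layout (status, CHS start, type, CHS end, start LBA, sector count) is the classic MBR format.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446        # bootloader area
ENTRY_SIZE = 16             # one partition table entry
ENTRY_COUNT = 4             # maximum number of primary entries
SIGNATURE = b"\x55\xaa"     # last 2 bytes of a valid MBR


def parse_mbr(sector):
    """Return the non-empty partition entries of a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR sector must be exactly 512 bytes")
    if bytes(sector[510:512]) != SIGNATURE:
        raise ValueError("missing 55AA signature")
    entries = []
    for index in range(ENTRY_COUNT):
        offset = BOOT_CODE_SIZE + index * ENTRY_SIZE
        raw = bytes(sector[offset:offset + ENTRY_SIZE])
        # status, CHS start, type, CHS end, start LBA, sector count
        status, _chs_start, ptype, _chs_end, start_lba, sectors = struct.unpack(
            "<B3sB3sII", raw
        )
        if ptype == 0:
            continue  # type 0 marks an unused entry
        entries.append({
            "number": index + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "start_lba": start_lba,
            "sectors": sectors,
        })
    return entries
```

On a real system the sector could be read with something like `open("/dev/sda", "rb").read(512)` as root. The disk name here is only an example.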

UI Hangs on “Loading” in vRealize Operations Manager (vROps)

Sometimes you may see the vROps web UI hang, stay on “Loading”, or not respond to any click. This happens because you are using vRealize Operations Manager 6.6 or an earlier version and your computer has a touch screen.

There are two ways to fix that:

  1. Disable the touch screen hardware in Device Manager. The device name is usually “HID-compliant touch screen” under “Human Interface Devices”.
  2. If you are using Chrome, follow the steps below:
    1. Open Chrome and browse to chrome://flags.
    2. Find “Touch Events API” and disable it.

Regular Expression

Regular Expression is also called Regexp. There are two classes of regular expressions: Basic Regexp (BRE) and Extended Regexp (ERE). Put the Regexp in double quotation marks so that the shell does not interpret its special characters.

Regexp elements can be categorized as follows:

Strings

. – Matches any single character.

Sample: # cat /etc/fstab | grep "U..D"

It returns the lines that contain U and D with any 2 characters between them.

[] – Matches any single character from the specified set.

Sample: # cat /etc/fstab | grep "U[UAB]ID"

It returns the lines that contain U, followed by U, A, or B, followed by ID.

[^] – Matches any single character except the specified ones.

Sample: # cat /etc/fstab | grep "[^AB]ID"

It returns the lines that contain ID preceded by any character other than A or B.

[[:digit:]] – All digits.

[[:lower:]] – All lowercase letters.

[[:upper:]] – All uppercase letters.

[[:alpha:]] – All alphabetic characters.

[[:alnum:]] – All alphanumeric characters.

[[:space:]] – All whitespace characters.

[[:print:]] – All visible characters and space.

[[:blank:]] – Space and tab.

[[:punct:]] – Punctuation and symbols.
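
The three bracket and dot patterns above behave the same way in Python's `re` module, so they are easy to try without a real /etc/fstab. The sketch below uses a tiny `grep` helper of my own and made-up sample lines. Note that Python's `re` does not support the POSIX classes such as [[:digit:]], so those are left out here.

```python
import re


def grep(pattern, lines):
    """Return the lines in which the pattern matches anywhere, like grep does."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]


# Made-up sample lines, not taken from a real /etc/fstab.
SAMPLE_LINES = [
    "UUID=1234 / ext4 defaults 0 0",
    "UAID=test",
    "UXID=9",
    "LABEL=data /data xfs defaults 0 0",
]

any_two = grep("U..D", SAMPLE_LINES)          # U, any 2 characters, D
from_set = grep("U[UAB]ID", SAMPLE_LINES)     # U, then U/A/B, then ID
not_in_set = grep("[^AB]ID", SAMPLE_LINES)    # ID preceded by anything but A or B
```

With these sample lines, `any_two` holds the first three lines, `from_set` holds the UUID and UAID lines, and `not_in_set` holds the UUID and UXID lines.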

Counting

 

Location

 

Grouping

 

 

Error 0x800f081f When Enabling .NET 3.5 on Windows 10

When you install vSphere Client 5.x on a Windows 10 computer, you may see an “Enable .net 3.5 failed” message. When you then try to enable .NET 3.5 on Windows 10 manually, it shows error code 0x800f081f.

This issue occurs on computers where internet access is blocked or restricted by policy. The workaround is to use the command line to specify a local .NET source path and force the installation.

  1. Mount the Windows 10 ISO to your computer as a new drive.
  2. Copy the path of “xxx:\sources\sxs”, where xxx is the letter of the mounted drive.
  3. Run the following command.
    dism /online /enable-feature /featurename:netfx3 /all /limitaccess /source:xxx:\sources\sxs

UCS Manager UI Font Size on 4K Screen

Older versions of UCS Manager use a Java application. The UI fonts can be extremely small on a high-DPI screen. The fix is:

  1. Go to “C:\Program Files (x86)\Java\jre1.8.0_171\bin”.
  2. Open “Properties” of “jp2launcher.exe”.
  3. Go to the “Compatibility” tab -> “Change high DPI settings”.
  4. Check “Override high DPI scaling behavior…”.
  5. Select “System (Enhanced)” or “System”.

 

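Deploying vCenter Server VCSA on VMware Workstation

There are many articles online about how to deploy vCenter Server VCSA on VMware Workstation, but when I followed them I always ran into all kinds of problems during deployment. Below are the key points I have summarized, for reference only.

I assume your lab has no DNS or domain server and simply uses the DHCP service of VMware Workstation, with the VM NIC set to “host-only”. The steps below are only meant for quick testing.

  1. vCenter Server validates the FQDN on its first boot after installation. If you have no DNS server, the FQDN validation fails. So make sure you enter an IP address for “Host Network Identity” when installing vCenter Server.
  2. The VM starts automatically right after the OVA file is imported, and sometimes its NIC is disconnected. Make sure the NIC is connected.
  3. The first boot takes about 15 to 20 minutes. The VM console does not show the IP address until the boot has fully completed. Another sign that vCenter Server is ready is that the IP address responds to ping.
  4. After the first boot, open https://vcenter_ip:5480 to continue configuring vCenter Server.
  5. The password of Administrator@vsphere.local is the one you entered on the OVA import screen.

Updates 28th May 2018:

In step 4 above, you may be unable to log in as root and get an authentication failure. This is caused by the root account being locked. Follow the steps below to unlock it:

  1. Reboot the vCenter Server VM.
  2. Press “e” at the Photon boot screen.
  3. Append “rw init=/bin/bash” to the end of the second line. See here for details.
  4. When you see the # prompt, run “passwd” to change the root password.
  5. Run “pam_tally2 --user root” to check how many times the root password was entered incorrectly.
  6. If the count is greater than 1, run “pam_tally2 --user root --reset” to unlock the root account.
  7. Reboot the VM. You should be able to log in now.

Updates 31st May 2018:

After logging in at step 4 above, you should see the vCenter Server installation wizard. If you want vCenter Server to use only an IP address, make sure the “System name” field contains the IP address.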

Deploy vCenter Server Virtual Appliance on VMware Workstation

There are a lot of articles that introduce how to deploy the vCenter Server virtual appliance on VMware Workstation. I tried following them, but the deployment failed. The following are some notes for your reference if you want to deploy the vCenter Server virtual appliance on VMware Workstation quickly.

I assume you don’t have DNS or domain servers, you use the native DHCP service of VMware Workstation, you only want vCenter Server for quick testing, and you select a “host-only” NIC.

  1. The vCenter Server installer validates the FQDN on first boot. The process fails if the FQDN doesn’t resolve. So make sure “Host Network Identity” is the IP address of the VM when you set the OVA options.
  2. The VM boots immediately after the OVA file is imported, but the VM NIC is sometimes in “disconnected” status. You have to connect the NIC in the VM properties right away.
  3. You have to wait about 15 – 20 minutes after the first boot. The console screen doesn’t show the IP address before the VM is fully ready. Another indicator of readiness is that the IP address of the VM responds to ping.
  4. Log in to https://vcenter_ip:5480 to continue the vCenter Server installation after the first boot is complete.
  5. The password of Administrator@vsphere.local is the same as the one you set when importing the OVA.
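
If you don’t want to keep pinging by hand during the 15 – 20 minutes in step 3, a small script can wait for you. This is only a sketch under one assumption of mine: instead of ICMP ping it checks whether the TCP port 5480 from step 4 accepts connections, which needs no special privileges. The function name and default timings are my own.

```python
import socket
import time


def wait_for_tcp(host, port, timeout_s=1800.0, interval_s=15.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=5):
                return True
        except OSError:
            pass  # not reachable yet, try again later
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Usage would look like `wait_for_tcp("vcenter_ip", 5480)`, with vcenter_ip replaced by the IP address of your VM.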

Updates 28th May 2018:

Root authentication in step 4 above may fail. It’s caused by the root account being locked. Follow the procedure below:

  1. Reboot the vCenter VM.
  2. Press “e” when you see the Photon boot screen.
  3. Add “rw init=/bin/bash” to the end of the 2nd line. Refer here for detail.
  4. Run “passwd” to change the root password when you see the # prompt.
  5. Run “pam_tally2 --user root” to check how many failed logins root has.
  6. Run “pam_tally2 --user root --reset” to unlock root if you see more than 1 in step 5.
  7. Reboot. You should be able to log in as root now.

Updates 31st May 2018:

You should see the installation wizard in step 4. Make sure the “System name” field is the IP address if you only want to use an IP address for vCenter Server.

Updates 5th Sep 2018:

You may see the following error during installation.

Could not connect to VMware Directory Service via LDAP

It indicates the vCenter Server FQDN doesn’t resolve. In a home lab, you may want to add the DNS entries to the hosts file.

Troubleshooting Network Performance of Virtual Machine

There are several layers of networking in a virtualization infrastructure: the guest operating system, the virtual machine, the ESXi driver, physical network adapters, RJ45/SFP connectors, network switches, etc. Sometimes it’s hard to say which layer caused a problem, especially for hardware problems. Today I worked on a very interesting case that may give you some ideas for troubleshooting network performance issues caused by the hardware layers.

A user told me he was bothered by the network performance of a virtual machine. Copying data to an NFS share was slow, but responses to the “ping” command looked good. I didn’t see any issue at the virtual machine layer: VMware Tools was up to date, Windows was patched, the virtual network adapter type was VMXNET3, and the VM version was also up to date.

When I tried to copy an image file to a shared folder on the virtual machine, the speed was sometimes fast and sometimes slow. Since I have two physical uplinks, I guessed that one of the uplinks could be the cause.

After a lot of swapping and cable changing, we eventually figured out there was a bad SFP on the network switch end. I was able to observe the issue by using “psping.exe” from Microsoft Sysinternals. I used the following command to send pings of different sizes to the virtual machine. Packet drops increased as I increased the packet size.

psping.exe -l <size of package> <Destination>
Example: psping.exe -l 4k xxxx.contoso.com

The size could be 1k, 2m or even larger. I think this is a good way to identify problems outside of ESXi, especially SFP problems, because this kind of problem didn’t produce any CRC or error counts at the network switch level.

You can also use the Windows native command “ping.exe” as follows. The size unit is bytes. For example, you need to input 4096 if you want to send 4 KB.

ping.exe -l <size> <Destination>
Example: ping.exe -l 4096 xxx.contoso.com
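
If you want to step through several sizes, the conversion from psping-style sizes to the byte values ping.exe expects can be scripted. This is a minimal sketch. The helper names are my own, and it assumes “k” means 1024 bytes and “m” means 1024 × 1024 bytes, in line with the 4 KB = 4096 example above.

```python
# Assumed multipliers for the size suffixes.
UNIT_BYTES = {"k": 1024, "m": 1024 * 1024}


def size_to_bytes(size):
    """Convert a size such as '4k' or '512' to a number of bytes."""
    text = size.strip().lower()
    if text and text[-1] in UNIT_BYTES:
        return int(text[:-1]) * UNIT_BYTES[text[-1]]
    return int(text)


def ping_commands(destination, sizes):
    """Build one ping.exe argument list per requested size."""
    return [
        ["ping.exe", "-l", str(size_to_bytes(size)), destination]
        for size in sizes
    ]
```

Each returned list can be passed to `subprocess.run` on a Windows machine. The sketch only builds the commands and does not run them.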

 

 

IE 11 Window Doesn’t Change Between 4K Internal and Regular External Monitors

Just a quick note. If you use multiple monitors, some 4K and some regular resolution, you may see window display issues when you move Internet Explorer between these monitors. Follow the KB below to change the registry so that Internet Explorer 11 accommodates the resolution of each monitor.

Internet Explorer 11 window display changes between a built-in device monitor and an external monitor