Network Problems of Auto Deployed ESXi Host in LAB

I built a simple Auto Deploy environment by vSphere 6.5 on nested environment. I created virtual ESXi hosts on a physical ESXi host to do the testing. The whole configuration was smoothly, I’m impressed Auto Deploy can be implemented in few hours. One thing bothered me was networking.

New ESXi hosts cannot get IP addresses properly somehow. It’s not a single problem. The symptoms are ESXi hosts cannot get IP address, or the Configure Management Network was grayed out on console, or ESXi hosts can get IP address but no responding to ping. Just quick post my solutions here.

To fix all these problems you need to do following:

  1. Enable Promiscuous Mode on the vSwtich which is attached to nested ESXi hosts on physical ESXi hosts.
  2. (I did that on Web Client of vCenter 6.5 U1. You may see different procedure on earlier versions.) Edit the host profile of Auto DeployNetworking configurationHost port group — Highlight Management Network — The option Determine how MAC address for vmknic should be decided — Choose Use the MAC Address from which the system was PXE booted.

If you don’t do step 1, your nested ESXi hosts may not able to get DHCP IP addresses properly, or it can get IP addresses but maps to a new MAC address lead to network packages cannot be transmitted.

Nested ESXi hosts get a DHCP IP addresses when do PXE booting. The hosts get another new IP addresses when apply host profile as soon as management network is created. It could be two different IP addresses and the MAC address of management network could be a new one that not same to any of vmnics. It will be hard to trace back on network switch in real environment, so I think it’s better also to do step 2.

Update 10/25/2017 — You should choose “User must explicitly choose the policy option” in step 2 above if you have multiple NICs. The reason is DHCP IP address during PXE may be captured by random NICs. If you choose what I mentioned in step 2, you will see DHCP server may learns MAC address of a none management network NICs associated with management IP address. Please refer this article for more detail.

Maximum Supported Boot Devices in Virtual Machine BIOS

Noticed a interesting limitation on VMware virtual machines. If you configure multiple SCSI controllers and distribute more than  8 virtual  disks. You may experience randomly OS boot up failure when power cycle VMs. Only last 8 disks with higher SCSI ID present in boot order settings of BIOS. You cannot choose the disks with lower SCSI ID.

You need to following up VMware KB “Changing the boot order of a virtual machine using vmx options (2011654)” to force virtual machines boot up on proper SCSI node.

Automatic vSphere Capacity Report in PPT

Reporting is important to management. To be a IT Pro, you may need to run regular reports for management. Some reports may be generated time consume. vRealize Operations Manager is an alternative to create customized reports. It’s a powerful product to organize data and create PDF or CSV files on scheduled intervals. I recommend have a look if you have planned to implement performance, capacity and alarm system for virtual environment.

What if budget is constrained? Is there a way to create such kind of reports? The answer is “Yes”. I worked out an automatic workflow to create the reports. I will not provide step-by-step guide in this post since it’s advanced integration of multiple products, everyone may have different way to do that. You can even create everything by script if you have strong programming skill. I’m not, I only look for the easiest way to achieve the goal.

Here is a scenario for  example: I want to run a monthly report for vSphere CPU and memory count and present to management by PowerPoint. I want to show management the historical trend of CPU and memory data. The traditional way is collect data in vCenter, organize and create charts in PowerPoint slides. So the whole workflow is: vCenter -> PowerPoint

If you want to automate the whole process you need to introduce few things more: PowerCLI, CSV and Excel. You need to develop a PowerCLI script to grab CPU and memory data on vCenter Server, then export the data to a CSV table by PowerShell command export-csv. Then import the table to an Excel file by Office feature Query Data. It loads the CSV table dynamically, you can even specific what data can be queried by filter.

Once the table is present in Excel, you need to create a chart accordingly. It’s trick when you paste the chart to PowerPoint Slide. You need to use Paste Special to paste the chart as Microsoft Excel Chart Object. The pasted chart can be updated automatically when you open the PowerPoint file.

The last step is created a scheduled task to run the PowerCLI script. Make sure you read my blog Extremely slow when run PowerShell script by scheduled tasks before create the task.

You can also configure the Excel file to automatically update table by CSV file.

Cannot Launch Patch Installer on Windows Server 2016

I was trying to update one Windows Server 2016 by standalone patch file. Somehow nothing happened after I double click the installer file. That’s because Windows Server 2016 prevent execute the  file due to it’s download from internet.

The quick fix is right click the file – Properties – Check Unblock – Click OK button.

Further more. The file has ADS (alternate data streams) attached. The ADS marked the file as download from internet.

You can run following two PowerShell commands to figure out object and value of the ADS.

PS C:> Get-Item test file.msu -Stream *
PSPath : Microsoft.PowerShell.CoreFileSystem::C:Userswzheng110917aDow
 $DATA
PSParentPath : Microsoft.PowerShell.CoreFileSystem::C:Userswzheng110917aDow
PSChildName : 20171011_KB4038801_Updates.msu::$DATA
PSDrive : C
PSProvider : Microsoft.PowerShell.CoreFileSystem
PSIsContainer : False
FileName : C:Userswzheng110917aDownloads20171011_KB4038801_Updates.msu
Stream : :$DATA
Length : 1241376269

PSPath : Microsoft.PowerShell.CoreFileSystem::C:Userswzheng110917aDow
 one.Identifier
PSParentPath : Microsoft.PowerShell.CoreFileSystem::C:Userswzheng110917aDow
PSChildName : 20171011_KB4038801_Updates.msu:Zone.Identifier
PSDrive : C
PSProvider : Microsoft.PowerShell.CoreFileSystem
PSIsContainer : False
FileName : C:Userswzheng110917aDownloads20171011_KB4038801_Updates.msu
Stream : Zone.Identifier
Length : 26

 

PS C:> Get-Content testfile.msu -Stream Zone.Identifier
[ZoneTransfer]
ZoneId=3

You can  see the ZoneId is 3. Following is a table to show which type of file it is.

0     My Computer 
1     Local Intranet Zone 
2     Trusted sites Zone 
3     Internet Zone 
4     Restricted Sites Zone

For more reference please read Microsoft blog “Alternate Data Streams in NTFS“.

You can use Unblock-File if you want to unblock multiple files.

 

博客是记录思想的地方

科技的发展真是非常快,十年前我还在用Windows Server 2003 和 Windows XP。十年后的今天,我们已经初尝到人工智能的味道。有那么多的Apps、网站、技术帮助我们更快速的学习新知识,人们的生活节奏越来越快,甚至学习这个人类的基本技能也在传统方式上增加了“碎片时间”方式(一个被“逻辑思维”所倡导的,2017年很流行的学习新方式)。

各种高科技的今天,我们可以用智能手机记录生活中的点点滴滴,甚至影像资料。但是“思想”,这个人类智力的核心却是无法记录的,唯文字可以反映作者当时的状态、情绪和记忆。在快节奏的城市生活中,我们有时候是需要慢下来、停下来的。回头看看自己的过去,读读当时的思想,回忆回忆曾经的记忆。很久前网络上开始流行一句话,大意是“别走得太快,等一等灵魂”(可能是假的名人名言)。我想也许文字可以做到这点。

这个国庆假期,利用空闲时间在网络上搜索搭建SS服务器的资料,无意间看到逗比根据地上的一篇文章介绍一个给互联网上所有网站做历史快照的网站,就随手搜了搜我的过去,竟然无意间发现了自己十几年前写的博客。这些文章早已被我遗留在互联网的某个角落,忘记了。看起来那应该是我从MSN Live空间搬过去的,还依稀记得2000年那会儿博客非常火爆,互联网企业都在推各自的免费博客服务,微软也不例外,但是后来好像因为这种服务不赚钱,以及监管原因,大量的博客服务开始关停,微软也不例外。幸好我当时把文章都转移了,今天才有机会帮我回忆起当时的状态。

非常高兴我可以重读当年的心力路程,让我再次如身临其境般的回到那个时代、回到那个状态。我会尝试花一些时间记录生活,给我的未来留下些参考。

Memory Errors on Modern Servers

I used to see memory degrading on  Cisco  UCS blades. But less see on HPE blades. I thought it maybe quality control problem of Cisco manufacture. Today I read two articles in Cisco website, it explains why we see memory degrading and how it works. I attached the articles below.

Managing Correctable Memory Errors on Cisco UCS Servers

UCS Enhanced Memory Error Management

The conduction in the whitepaper is not only specific for Cisco UCS, but also for any modern servers. Following is summary of why memory errors rates is going high nowadays.

  • Larger memory systems contain more bits
  • Higher capacity DRAM chips require smaller bit cells which result in fewer stored charges per bit
  • Lower operating voltages can lead to reduced noise margin
  • Higher operating speeds can lead to reduced timing margin

Oracle Utilizes 50% of Physical Processors on HPE Server

DBA team told me Oracle was running slow on a HPE server. I observed the CPU utilization was about 50% of overall capacity. Whenever Oracle database bumps up the system experienced slowness.

Further  digged into the issue, I see Oracle workload only ran on single physical processor, but the server has two processors. And the  Windows 2012 R2 resource manager show the system used Processor Group, the two physical processors were grouped out. This technology is described in Microsoft MSDN article.

To fix the issue you have to change value of “NUMA Group Size Optimization” to “Flat” in BIOS. Please refer to HPE article for detail  steps.

Detail of HPE server behavior  is documented here. Please note, the article says it impacts to ProLiant Gen9 and Intel E5-26xx v3 processors. But it actually also impacts to Intel E5-26xx v4 and Synergy blades.

虚拟主机无法获得DHCP IP地址

刚解决了一个问题,快速更新一下。当虚拟主机无法获得DHCP IP地址时,应该做的第一件事情是检查防火墙,无论是Windows防火墙或者物理防火墙。UDP端口67和68不能被阻挡掉。否则会出现虚拟主机只能获得169.x.x.x的IP地址,这个地址是不可用的,表示虚拟主机无法从DHCP服务器获得地址。

这两个端口是DHCP客户端用来从DHCP服务器获取IP地址的。具体的技术细节可以参考RFC文档

DHCP uses UDP as its transport protocol. DHCP messages from a client

to a server are sent to the ‘DHCP server’ port (67), and DHCP

messages from a server to a client are sent to the ‘DHCP client’ port

(68). A server with multiple network address (e.g., a multi-homed

host) MAY use any of its network addresses in outgoing DHCP messages.

我在排错过程中也用到了这篇文章。

搜索虚拟主机时提示:Login To The Query Service Failed

使用vSphere Client登录vCenter Server 6.0时可能会出现如下报错信息:

Login to the query service failed.

The server could not interpret the communication from the client. (The remote server returned an error: (500) Internal Server Error.)

这是因为在登录vSphere Client时勾选了”Use Windows session credentials“。试试取消它。

相关知识库链接:Searching the Inventory with the vSphere Client fails (2143566)