I used to see memory degradation on Cisco UCS blades, but much less often on HPE blades. I thought it might be a quality control problem in Cisco's manufacturing. Today I read two articles on the Cisco website that explain why we see memory degradation and how it works. The articles are listed below.
Managing Correctable Memory Errors on Cisco UCS Servers
UCS Enhanced Memory Error Management
The conclusion in the whitepaper is not specific to Cisco UCS; it applies to any modern server. The following is a summary of why memory error rates are rising nowadays.
- Larger memory systems contain more bits
- Higher capacity DRAM chips require smaller bit cells which result in fewer stored charges per bit
- Lower operating voltages can lead to reduced noise margin
- Higher operating speeds can lead to reduced timing margin
The DBA team told me Oracle was running slowly on an HPE server. I observed that CPU utilization was about 50% of overall capacity. Whenever the Oracle database load spiked, the system slowed down.
Digging further into the issue, I saw that the Oracle workload ran on only a single physical processor, although the server has two. The Windows Server 2012 R2 resource manager showed that the system used Processor Groups, with the two physical processors placed in separate groups. This technology is described in a Microsoft MSDN article.
To fix the issue, change the value of “NUMA Group Size Optimization” to “Flat” in the BIOS. Please refer to the HPE article for detailed steps.
Details of the HPE server behavior are documented here. Please note that the article says it affects ProLiant Gen9 servers with Intel E5-26xx v3 processors, but it actually also affects Intel E5-26xx v4 processors and Synergy blades.
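To confirm how Windows has grouped the processors before and after the BIOS change, a small script can help. The following is a minimal sketch, assuming Python is available on the server; it calls the Win32 functions GetActiveProcessorGroupCount and GetActiveProcessorCount through ctypes, and the helper name processor_groups is my own. On a non-Windows machine it simply returns an empty list.

```python
import ctypes
import sys


def processor_groups():
    """Return the active logical processor count of each Windows processor group.

    Returns an empty list on non-Windows platforms, where processor
    groups do not exist.
    """
    if sys.platform != "win32":
        return []
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    # WORD GetActiveProcessorGroupCount(void)
    kernel32.GetActiveProcessorGroupCount.restype = ctypes.c_ushort
    # DWORD GetActiveProcessorCount(WORD GroupNumber)
    kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_ushort]
    kernel32.GetActiveProcessorCount.restype = ctypes.c_uint32
    group_count = kernel32.GetActiveProcessorGroupCount()
    return [kernel32.GetActiveProcessorCount(g) for g in range(group_count)]


if __name__ == "__main__":
    groups = processor_groups()
    if not groups:
        print("Processor groups are a Windows feature; nothing to report here.")
    for index, count in enumerate(groups):
        print(f"Group {index}: {count} logical processors")
```

If the script reports two groups on a two-socket server, an application that is not group-aware will most likely stay inside one of them, which matches the 50% utilization I saw. After switching to “Flat” I would expect a single group, as long as the server has no more than 64 logical processors.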
You may see the following problem when you log in to vCenter Server 6.0 with the vSphere Client:
Login to the query service failed.
The server could not interpret the communication from the client. (The remote server returned an error: (500) Internal Server Error.)
That’s because the “Use Windows session credentials” checkbox is selected. Deselect it and try again.
Refer to the KB article Searching the Inventory with the vSphere Client fails (2143566).
Just a quick post. When a virtual machine cannot get a DHCP IP address, the first thing to check is the firewall, whether it is the Windows firewall or a physical firewall. Make sure UDP ports 67 and 68 are not blocked. Otherwise the virtual machine gets only a 169.254.x.x (APIPA) address.
These two ports are required for a DHCP client to request IP addresses. The mechanism is described in RFC 2131:
DHCP uses UDP as its transport protocol. DHCP messages from a client
to a server are sent to the ‘DHCP server’ port (67), and DHCP
messages from a server to a client are sent to the ‘DHCP client’ port
(68). A server with multiple network address (e.g., a multi-homed
host) MAY use any of its network addresses in outgoing DHCP messages.
I also got some ideas from this post.
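When checking many virtual machines at once, it is handy to flag the ones that fell back to a self-assigned address. Here is a minimal sketch using Python's standard ipaddress module; the function name is_apipa and the sample addresses are made up for illustration.

```python
import ipaddress

# The IPv4 link-local (APIPA) range that Windows falls back to
# when no DHCP server answers.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True if the address is a self-assigned 169.254.x.x address."""
    return ipaddress.ip_address(address) in APIPA_NETWORK


if __name__ == "__main__":
    # Hypothetical addresses collected from a few virtual machines.
    for addr in ["169.254.12.7", "10.0.0.5", "192.168.1.20"]:
        status = "no DHCP lease (APIPA)" if is_apipa(addr) else "looks fine"
        print(f"{addr}: {status}")
```

A machine flagged this way never received a DHCP reply, so the firewall rules for UDP 67 and 68 are the first place to look.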
You may see ‘Adobe Flash Player Out of Date’ in Chrome when you open the vSphere Web Client. Clicking the text normally makes Chrome update Flash Player automatically. In some cases that doesn’t work, perhaps because your Chrome is controlled by company policy or because of a connectivity problem to Adobe.com. I found an article that shows how to fix the issue offline: download Flash Player for Opera and Chromium-based browsers – PPAPI from the official Adobe KB article.
You may also want to check out my other articles about Flash issues in browsers.
Flash menu appears when right click on vSphere Web Client in Chrome
Cannot open vSphere Web Client on IE11 on Windows 8.1
Slight network latency may cause application problems on sensitive virtual machines, even when the network response time is just 3 or 7 ms. There is a way to make the response latency more stable: enable RSS (Receive Side Scaling) on the NIC.
When RSS is disabled, network traffic is handled by a single CPU core. Enabling it distributes the workload across 4 cores by default. You can increase the number of CPUs used for RSS by changing the registry.
To summarize the solution: go to Device Manager -> NIC properties -> Advanced, find the RSS option and enable it. You will see 2 – 3 network drops when applying it.
You can refer to the following articles for details.
Poor network performance or high network latency on Windows virtual machines
Virtual Receive-side Scaling in Windows Server 2012 R2
To increase the number of CPUs used for RSS, read the following article to learn how to modify it.
Setting the Number of RSS Processors
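To illustrate why RSS helps, here is a toy model of how flows get spread over several cores. It is only a sketch: real RSS uses a Toeplitz hash and an indirection table in the NIC and driver, while this example swaps in a plain CRC32 of the connection 4-tuple. The function name pick_cpu and the sample addresses are invented for the example.

```python
import zlib
from collections import Counter


def pick_cpu(src_ip, src_port, dst_ip, dst_port, num_cpus=4):
    """Map a flow (4-tuple) to a CPU index in the range 0..num_cpus-1.

    CRC32 stands in for the Toeplitz hash that real RSS uses.
    """
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_cpus


if __name__ == "__main__":
    # 1000 hypothetical client connections to one server port.
    flows = [("10.0.0.%d" % (i % 250 + 1), 40000 + i, "10.0.1.10", 1521)
             for i in range(1000)]
    for cpus in (1, 4):
        load = Counter(pick_cpu(*flow, num_cpus=cpus) for flow in flows)
        print(f"{cpus} RSS CPU(s):", dict(sorted(load.items())))
```

With one CPU every flow lands on the same core, which is the situation when RSS is disabled. With four, packets of the same connection still go to the same core, but different connections are spread out.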
If your company has implemented a firewall that blocks public NTP servers, the installation of vRealize Operations Manager may hang at ./install.sh on the console. That’s because the installer tries to negotiate with the NTP server http://www.iana.org and the firewall blocks the traffic.
VMware TAM Manager Shan told me a firewall has two options for blocking traffic: REJECT and DROP. REJECT means the firewall responds to the request and lets the source device know it was rejected. DROP means the firewall silently discards the request and sends no response to the source device. It looks like there is a bug in the vROps code that makes it hang when an NTP request is dropped and gets no response.
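From the client side the two behaviours look quite different, and a quick probe shows which one you are facing. The sketch below uses a TCP connection because the result is easy to observe; NTP itself runs over UDP, so treat it as an illustration of REJECT versus DROP rather than an NTP test. The function name probe_tcp and its return strings are my own.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify what happens when connecting to host:port over TCP.

    "open"        - the connection was accepted
    "rejected"    - something answered and actively refused the connection
    "no response" - nothing came back before the timeout (DROP-like)
    "error"       - any other network error
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "rejected"
    except socket.timeout:
        return "no response"
    except OSError:
        return "error"


if __name__ == "__main__":
    # Local demonstration: a listening port, then the same port after closing.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print("listening:", probe_tcp("127.0.0.1", port))
    listener.close()
    print("closed:   ", probe_tcp("127.0.0.1", port))
```

A client that gets a rejection fails fast and can move on. A client that gets no response has to wait for its own timeout, and if the code has no timeout it waits forever, which is what the installer appears to do.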
The workaround is to create a port group without physical uplinks and install vRealize Operations Manager there, then move it to the proper network after the installation is completed. You can configure the correct IP addresses when importing the OVF file, so later on you simply need to change the network.