Private IP Address Routes to L3 Subnet on Dual vNIC VM

It’s hard to describe this issue in a one-line title, so let me give some background. I have 2 sets of VMs. Set 1 has VM A & VM B. Set 2 has VM C & VM D. Each VM has a vNIC configured with a private IP address. VM A and VM C also have a second vNIC configured with an L3 (routable) IP address. Both sets use the same private IP addresses. To avoid conflicts, I implemented a vRouter VM for each set. Like VM A and VM C, the vRouter has two vNICs: one is connected to the L3 network, the other to the private network. This should keep private network traffic inside its own set, so the two sets don’t disturb each other even though they use the same private IP addresses.

Diagram

These are the private IP addresses I set for each VM:

  • VM A: 192.168.0.11
  • VM B: 192.168.0.12
  • VM C: 192.168.0.11
  • VM D: 192.168.0.12

The problem is that VM A still gets ping replies from 192.168.0.12 after I turn off VM B. I expected the L2 traffic to go to its own vRouter and find that VM B is offline. But the tracert command shows the traffic going from VM A’s L3 network to the vRouter of the 2nd set, and then getting the answer from VM D. It looks like the L2 ping packets are being broadcast on the L3 network.

The issue was fixed by enabling a feature on the L3 network. It’s called “Enforce Subnet Check for IP Learning”. Cisco changed the name to “Limit IP Learning To Subnet”. It’s a VLAN-level setting. It stops private IP traffic from being broadcast on the L3 network and forces it to stay on the L2 network.
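To illustrate the idea behind the setting, here is a minimal Python sketch of a subnet check: an endpoint IP is only "learned" on a network when it falls inside one of the subnets configured for that network. This is only a model of the behaviour, not Cisco's implementation. The function name `should_learn` and the subnet 10.10.20.0/24 are made up for the example.

```python
import ipaddress

def should_learn(endpoint_ip, configured_subnets):
    """Return True if endpoint_ip falls inside any configured subnet."""
    ip = ipaddress.ip_address(endpoint_ip)
    return any(ip in ipaddress.ip_network(s) for s in configured_subnets)

# Hypothetical routable subnet for the L3 network (not from my real setup).
l3_subnets = ["10.10.20.0/24"]

print(should_learn("10.10.20.15", l3_subnets))   # True: routable address is learned
print(should_learn("192.168.0.12", l3_subnets))  # False: private address is ignored
```

With the check enabled, 192.168.0.12 is never learned on the L3 network, so VM A’s ping can no longer be answered by VM D through that path.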

Vlan ‘xxx’ resolved to unsupported VLAN ID in Cisco UCS Manager

You may need only 1 IP address for the blade console in Cisco UCS Manager. You can follow Understanding “Management IP” of Cisco UCS Manager to configure it. If you are cleaning up existing management IP pools, you may see the warning “Vlan ‘xxx’ resolved to unsupported VLAN ID” after you delete the existing inband and outband IP pools.

That’s because the inband IP address of the blade has not been cleaned up. You have to go to “Equipment” -> “Chassis” -> Target chassis -> “Servers” -> Target server -> “Inventory” tab -> “CIMC” tab -> Click “Change Inband Management IP” -> Remove the existing VLAN and IP pool.

You will see that the inband IP tab is blank once the change is saved. Please note, the IP address is reassigned after about 1 minute if you click “Delete Inband Configuration” instead of “Change Inband Management IP”.

Understanding “Management IP” of Cisco UCS Manager

IP addressing for KVM in Cisco UCS Manager is different from HPE servers. It may assign multiple IP addresses to the same blade if you don’t configure it properly. In my case each blade got 3 IP addresses!

There are actually 3 types of IP address for KVM (the Cisco manual says 2):

  • Outband Management IPs.
  • Inband Management IPs for Blades.
  • Inband Management IPs for Service Profiles.

“Outband Management IP” is the default for KVM. Every newly deployed blade will try to get a DHCP IP over the management port, in the same VLAN as Cisco UCS Manager.

The 2nd and 3rd types are more confusing. “Inband Management IPs for Blades” is from the “hardware” perspective. “Inband Management IPs for Service Profiles” is from the “logical” perspective.

If you go to “Equipment” -> Chassis -> blade and click the KVM to open the console, you get the console over either “Outband Management IP” or “Inband Management IPs for Blades”.

If you go to “Servers” -> “Service Profiles” and click the KVM of a service profile, you get the console over either “Outband Management IP” or “Inband Management IPs for Service Profiles”.

If you want to configure just 1 IP for a blade, whether it’s for hardware or for the service profile, you need to do the following:

  1. Delete the address range of the default “ext-mgmt” pool in “IP Pools” under the “LAN” node in Cisco UCS Manager.
  2. Create a new inband IP pool and a VLAN group without uplinks.
  3. Assign the VLAN group and the inband IP pool to the templates or service profiles.

Refer to Setting the Management IP Address in the Cisco UCS Manager manual for details.

BTW, you may see Vlan ‘xxx’ resolved to unsupported VLAN ID in Cisco UCS Manager when you clean up the existing IP pools and create a new inband pool.

“x/xx on FI-A is connected by a unknown server device” on Cisco UCS

You may see the following error in the ‘info’ category of error messages in Cisco UCS Manager after upgrading the infrastructure firmware to 3.2.x.

“x/xx on FI-A is connected by a unknown server device”

This is a bug documented in CSCvk76095. You have to reset the port on the FI to fix it.

  1. Go to “Equipment” in Cisco UCS Manager.
  2. Go to “Fabric Interconnects” -> Go to the corresponding FI.
  3. Right-click the port x/xx -> Choose “Disable”.
  4. You will see multiple major faults. Wait for 5 seconds.
  5. Right-click the port x/xx -> Choose “Enable”.
  6. All warnings disappear after about 5 minutes. You may still see the warning in the GUI due to caching. Log in again and check.

This change affects one link between the IOM and the FI port. You need downtime if the IOM has only a single path. I didn’t see any impact on the ESXi blades in the pod.

Show CDP Neighbor of Cisco UCS Uplinks

There are two ways to know which network switch ports the network uplinks of Cisco UCS Fabric Interconnects are connected to.

By CLI

  • SSH to the Cisco UCS Manager.
  • Connect to FI-A.
# connect nxos a
  • Show neighbor of network uplinks.
# show cdp neighbor interface ethernet <port num>

By PowerShell

  • Make sure Cisco PowerTool (For UCS Manager) is installed.
  • Enable the Info Policy via the UCSM GUI.
    • Go to “Equipment” -> “Policies” tab -> “Global Policies” tab -> “Info Policy” area.
    • Change it to “Enabled”. (No impact on running blades)
  • Open a PowerShell window.
  • Connect to the UCS Manager.
# Connect-Ucs <UCS FQDN>
  • Show CDP neighbor details.
# Get-UcsNetworkLanNeighborEntry

Side notes

The following command shows the network switch name, the network switch ports and the FI ports:

# Get-UcsNetworkLanNeighborEntry | Select deviceid,remoteinterface,localinterface
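If you save that output to a file, a small script can turn it into a readable cabling list. The sketch below is Python and assumes the output was piped to `Export-Csv -NoTypeInformation`, so the first line is a header with the three selected property names. The function name `uplink_rows` and the sample switch and port names are hypothetical.

```python
import csv
import io

def uplink_rows(csv_text):
    """Return sorted (FI port, switch name, switch port) tuples from CSV text."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Lower-case the header names so the lookup does not depend on casing.
        r = {k.lower(): v for k, v in row.items()}
        rows.append((r["localinterface"], r["deviceid"], r["remoteinterface"]))
    return sorted(rows)

# Hypothetical sample data; real values come from your own export.
sample = """DeviceId,RemoteInterface,LocalInterface
switch-02,Ethernet1/10,Ethernet1/32
switch-01,Ethernet1/10,Ethernet1/31
"""

for fi_port, switch, switch_port in uplink_rows(sample):
    print(f"{fi_port} -> {switch} {switch_port}")
```

The rows are kept as a list rather than a dictionary keyed by FI port, because the same port number can appear on both FI-A and FI-B.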

If you prefer to enable the “Info Policy” with PowerShell, run the following command:

# Get-UcsTopInfoPolicy | Set-UcsTopInfoPolicy -State enabled -Force

“default Keyring’s certificate is invalid” in Cisco UCS Manager

You may see the following error in Cisco UCS Manager:

default Keyring’s certificate is invalid

The reason is that the default KeyRing under Admin -> Key Management has expired. It’s not possible to delete or change this KeyRing in the GUI. You have to log in to Cisco UCS Manager over SSH and run the following commands (the strings after “#”):

lab-B# scope security
lab-B /security # scope keyring default
lab-B /security/keyring # set regenerate yes
lab-B /security/keyring* # commit-buffer
lab-B /security/keyring #

This will disconnect the Cisco UCS Manager GUI on your client computer. Just refresh the page after 5 seconds. There is no impact on the blades.

A Huge Amount of Warnings of “Image is Deleted” in Cisco UCS Manager

A few days ago, I deleted some older firmware packages in Cisco UCS Manager. Suddenly more than 100 warnings were generated. The messages look similar to the one below:

blade-controller image with vendor Cisco System Inc……is deleted

Cause: image-deleted

Clearly, they were triggered by the package deletion. But all of my service profiles and service profile templates were using firmware packages that still exist. The deleted packages were not used anywhere.

I also deleted the download tasks and cleaned up everything I could. The warnings still persisted. After reading a blog article, I figured out that they were caused by the default firmware policy.

In case you are facing the same issue, go to Servers -> Policies -> Host Firmware Packages -> default -> Click Modify Package Versions -> Change it to an available version.

Cisco UCS Blade Cannot Get IP Address for KVM

You may see “The IP address to reach the server is not set” when clicking the KVM console in Cisco UCS Manager. The issue persists even when Cisco UCS Manager has enough IP addresses for management. Re-acknowledging the server or resetting the CIMC does not fix the problem.

The fix is to go to “Equipment” -> Select the server -> “General” tab -> “Server Maintenance” -> “Decommission” the server.

Wait for the decommission to complete, then re-acknowledge the server. An IP address will be assigned to the server after the acknowledge process is completed.

Memory Errors on Modern Servers

I have often seen memory degrading on Cisco UCS blades, but much less on HPE blades. I thought it might be a quality control problem in Cisco’s manufacturing. Today I read two articles on the Cisco website that explain why we see memory degrading and how it works. I list the articles below.

Managing Correctable Memory Errors on Cisco UCS Servers

UCS Enhanced Memory Error Management

The conclusion in the whitepaper applies not only to Cisco UCS but to any modern server. The following is a summary of why memory error rates are going up nowadays.

  • Larger memory systems contain more bits
  • Higher capacity DRAM chips require smaller bit cells which result in fewer stored charges per bit
  • Lower operating voltages can lead to reduced noise margin
  • Higher operating speeds can lead to reduced timing margin

Cisco UCS blade B200 M4 discovery pending on 58%

New B200 M4 blades can run on Intel v4 processors. You may see a discovery issue if your UCSM firmware version is lower than 2.2.7c. I hit that problem a few days ago when I installed a new M4 blade. The FSM hung at 58% for a really long time and eventually failed.
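If you want to check a list of UCS domains against that minimum version before installing blades, a comparison like the following can help. It is a minimal Python sketch, assuming the version is written either as “2.2.7c” (as in this post) or as “2.2(7c)” (as UCSM usually displays it). The function names are hypothetical.

```python
import re

def parse_ucsm_version(text):
    """Split '2.2(7c)' or '2.2.7c' into a comparable tuple such as (2, 2, 7, 'c')."""
    m = re.fullmatch(r"(\d+)\.(\d+)[.(](\d+)([a-z]*)\)?", text.strip())
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, letter = m.groups()
    return int(major), int(minor), int(patch), letter

def meets_minimum(current, minimum="2.2.7c"):
    """Return True if current is at or above the minimum version."""
    return parse_ucsm_version(current) >= parse_ucsm_version(minimum)

print(meets_minimum("2.2(6g)"))  # False: older than 2.2.7c
print(meets_minimum("2.2(7c)"))  # True
print(meets_minimum("3.2(3a)"))  # True
```

The numeric parts are compared as integers and the trailing letter as a string, so “2.2(10a)” sorts correctly above “2.2(7c)”.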
