This is the first article of my VMware NSX Troubleshooting series. I am aware that a lot of other blog posts on this topic have already been published, but for me this post is also part of my learning process with VMware NSX. Writing about a topic helps me to identify and close my existing knowledge gaps. I will start this series of posts with the NSX Management and Control Plane.
The new VMware vCenter HTML5 Client Integration for NSX has a great summary of the environment state, but within this blog post we will dig a little bit deeper into the individual components of the NSX Management and Control Plane.
To get started with NSX troubleshooting, I recommend the VMware NSX Troubleshooting Guide and the NSX Command Line Quick Reference. The great NSX 6.4 Control Plane Logical View from Tim Sandy will help you to understand the relationships between the NSX Management, Control and Data Plane.
Troubleshooting NSX Manager
The NSX Manager is the first component of the NSX Management and Control Plane. To be exact, it is one of two components of the Management Plane; the other one is the VMware vCenter itself. Most environments have only one NSX Manager, but a cross-vCenter NSX environment can have one primary and multiple secondary NSX Managers.
Check NSX Manager file system
Show the file system usage on the NSX Manager.
Monitor NSX Manager processes
Show currently running processes on the NSX Manager.
Check NSX Manager Logs
Show the appmgmt, manager, or system log of the NSX Manager.
Additional options:
Option | Description |
---|---|
follow | Update the displayed log |
reverse | Show the log in reverse chronological order |
last n | Show the last n events in the log |
NSX Manager Packet Capture
Display all packets captured by an NSX Manager interface. This example shows all http and https packets on the Management interface (The expression is a tcpdump-formatted string):
Note: Enable mode is required.
Verify NSX Manager date and time
Show the current time and date of the NSX Manager.
Troubleshooting NSX Controller
The NSX Controller Cluster is the second part of the NSX Management and Control Plane we look at in this post. The Controller Cluster needs (in VMware NSX 6.4.0) exactly three nodes for a proper setup. NSX Edges and NSX DLR Control VMs can also be counted as part of the Control Plane, but I will do a separate post for these components.
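One way to reason about the three-node requirement: the Controller Cluster needs a majority of its nodes to be available to keep working. The following is a minimal sketch of that arithmetic only. The function names are hypothetical and not part of any NSX tooling.

```python
def majority_needed(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def has_majority(cluster_size: int, healthy_nodes: int) -> bool:
    """True if the healthy nodes still form a majority of the cluster."""
    return healthy_nodes >= majority_needed(cluster_size)


if __name__ == "__main__":
    # A three-node cluster tolerates the loss of one controller, but not two.
    print(has_majority(3, 2))  # True
    print(has_majority(3, 1))  # False
```

This is why a single failed controller is a warning sign rather than an outage, while a second failure is critical.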
Identify NSX Controllers
Show all controller nodes in the Controller Cluster.
NSX Manager:
NSX UI:
Check NSX Controller Interfaces
Show the IP configuration of the NSX Controller.
Show NSX Controller TCP Connections
Show active TCP connections of the NSX Controller.
Show NSX Controller TCP Dump
Run a TCP Dump for the NSX Controller management interface.
Show NSX Controller Cluster Status
Show the Controller Cluster status per controller. In case of a problem, this should be verified on each controller in the cluster.
Show Controller Cluster Roles
Show active roles per controller. In case of a problem, this should be verified on each controller in the cluster.
Show Controller Cluster Connections
Show Controller Cluster connections for the individual roles. In case of a problem, this should be verified on each controller in the cluster.
Show NSX Controller Cluster History
Show the event history of the Controller Cluster.
Check NSX Controller Logs
Check the NSX Controller logs for known issues, errors and warnings.
Slow Disk:
Disk space usage:
Main controller log - warnings and errors:
Troubleshooting ESXi Host
In my opinion, at least some components of the ESXi Hosts belong to the NSX Management and Control Plane, although the line to the Data Plane becomes blurry.
Verify the NSX VIB installation
Verify that the NSX-V VIB is installed on the ESXi Host.
Verify currently loaded NSX Modules
Verify that all the NSX modules are currently loaded in the ESXi system.
Verify VXLAN IP Connection between ESXi Hosts
Verify the connection between all the VTEPs in your environment.
Identify the NSX VMkernel interfaces:
Run IP connection test:
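If the test should also validate the MTU on the path between the VTEPs, the ICMP payload size has to be derived from the MTU. Here is a minimal sketch of that arithmetic, assuming IPv4 without options (20-byte IP header, 8-byte ICMP header) and a VXLAN transport MTU of 1600 as an example value. The function name is hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_icmp_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo packet")
    return payload


if __name__ == "__main__":
    print(max_icmp_payload(1600))  # 1572
    print(max_icmp_payload(1500))  # 1472
```

With the don't-fragment bit set, a ping with this payload size only succeeds if every hop on the path carries the full MTU.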
View Routing table for the VXLAN TCP/IP Stack:
View ARP table for the VXLAN TCP/IP Stack:
Run NSX Host Health Check
Show details of the health status of the specified ESXi Host.
Identify ESXi Host ID:
Run NSX Host Health Check:
Query API for ESXi Management and Control Plane connection
Query the NSX Manager API for the Management and Control Plane connection state of all ESXi Hosts.
Query API for ESXi Management and Control Plane connection details
There is another NSX Manager API query for a more detailed status of a single ESXi Host.
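A per-host query like this can be prepared from a script. The sketch below only builds the request and does not send it. The endpoint path is an assumption recalled from the NSX for vSphere API guide and should be verified for your NSX version. The manager name, host ID and credentials are hypothetical placeholders.

```python
import base64
import urllib.request


def build_host_status_request(manager: str, host_id: str,
                              user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one host's connection status."""
    # ASSUMPTION: endpoint path as recalled from the NSX for vSphere API guide;
    # verify it against the API guide for your NSX version before use.
    url = f"https://{manager}/api/2.0/vdn/inventory/host/{host_id}/connection/status"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/xml")
    return request


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    req = build_host_status_request("nsxmgr.example.local", "host-10", "admin", "secret")
    print(req.get_method(), req.full_url)
```

Sending the request would additionally need `urllib.request.urlopen` with a TLS context that trusts the NSX Manager certificate.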
Known Error Codes:
- 1255602: Incomplete Controller Certificate
- 1255603: SSL Handshake Failure
- 1255604: Connection Refused
- 1255605: Keep-alive Timeout
- 1255606: SSL Exception
- 1255607: Bad Message
- 1255620: Unknown Error
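When processing the API response in a script, the error codes listed above can be kept in a small lookup table. This is a minimal sketch. The table contains only the codes from this post, and the names `CONTROLLER_CONNECTION_ERRORS` and `describe_error` are hypothetical.

```python
# Error codes and descriptions exactly as listed in this post.
CONTROLLER_CONNECTION_ERRORS = {
    1255602: "Incomplete Controller Certificate",
    1255603: "SSL Handshake Failure",
    1255604: "Connection Refused",
    1255605: "Keep-alive Timeout",
    1255606: "SSL Exception",
    1255607: "Bad Message",
    1255620: "Unknown Error",
}


def describe_error(code: int) -> str:
    """Return the description for a known code, or a fallback for any other."""
    return CONTROLLER_CONNECTION_ERRORS.get(code, f"Unrecognized error code {code}")


if __name__ == "__main__":
    print(describe_error(1255604))  # Connection Refused
```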
Verify Control Plane Agent Status
Verify the Control Plane Agent (netcpad) status on the ESXi Hosts.
Verify Stateful Firewall Service
Verify the Stateful Firewall Service (vShield-Stateful-Firewall) status on the ESXi Hosts.
Check the Control Plane Agent Configuration
Check the control plane agent configuration on the ESXi Hosts. The IP addresses of all NSX Controllers should be listed.
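The comparison "are all controllers listed?" can be scripted once the configuration text has been copied off the host. The sketch below deliberately makes no assumption about the file format. It only extracts everything that looks like an IPv4 address and compares it with the expected controller addresses. The function name and the sample text are hypothetical.

```python
import re

# Matches dotted-quad strings; it does not validate the 0-255 range.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def missing_controllers(config_text: str, expected_ips: set[str]) -> set[str]:
    """Return the expected controller IPs that do not appear in the config text."""
    found = set(IPV4_PATTERN.findall(config_text))
    return expected_ips - found


if __name__ == "__main__":
    # Hypothetical sample text, not the real file format.
    sample = "controller 192.168.10.11\ncontroller 192.168.10.12\n"
    expected = {"192.168.10.11", "192.168.10.12", "192.168.10.13"}
    print(missing_controllers(sample, expected))  # {'192.168.10.13'}
```

Matching whole addresses rather than substrings avoids treating `192.168.10.110` as a hit for `192.168.10.11`.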
Verify that Stateful Firewall Service is configured
Verify that the Stateful Firewall Service is configured with the IP address of the NSX Manager.
Verify IP connection to the Controllers
Verify the active IP connections from the ESXi Host to all controllers in the NSX Controller Cluster.
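The check itself boils down to a set comparison: every controller should show up as the remote end of an established session. A minimal sketch follows, assuming the connection list has already been parsed into tuples and that the control plane agent talks to the controllers on TCP port 1234. The `Connection` type, the function name and the sample data are hypothetical.

```python
from typing import Iterable, NamedTuple

# ASSUMPTION: TCP port used between the control plane agent and the controllers.
CONTROLLER_PORT = 1234


class Connection(NamedTuple):
    remote_ip: str
    remote_port: int
    state: str


def controllers_without_session(connections: Iterable[Connection],
                                controller_ips: set[str]) -> set[str]:
    """Return the controllers that have no established session from this host."""
    connected = {
        c.remote_ip
        for c in connections
        if c.remote_port == CONTROLLER_PORT and c.state == "ESTABLISHED"
    }
    return controller_ips - connected


if __name__ == "__main__":
    # Hypothetical, already-parsed sample data.
    sample = [
        Connection("192.168.10.11", 1234, "ESTABLISHED"),
        Connection("192.168.10.12", 1234, "SYN_SENT"),
    ]
    controllers = {"192.168.10.11", "192.168.10.12", "192.168.10.13"}
    print(sorted(controllers_without_session(sample, controllers)))
    # ['192.168.10.12', '192.168.10.13']
```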
Verify IP connection to the Manager
Verify the active IP connections from the ESXi Host to the NSX Manager.
Check the Control Plane for the Logical Switches
Check the state of the VXLAN control plane for the logical switches on the ESXi Host.
Identify the VXLAN DVS:
Check VXLAN control plane state:
Show Control Plane Agent Log Files
Show the Control Plane Agent log files on the ESXi Host.
Show Stateful Firewall Service Log Files
Show the Stateful Firewall Service (message bus client) log files on the ESXi Host.
External references
- VMware NSX Troubleshooting Guide
- VMware NSX Command Line Reference
- VMware NSX Command Line Quick Reference
- VMware NSX for vSphere Troubleshooting Deep Dive by René van den Bedem
- NSX 6.4 Control Plane Logical View by Tim Sandy
- VMware NSX-V Control and Management Plane Connections Diagram by Martijn Smit