Objective 8.1 – Perform basic troubleshooting for ESX/ESXi hosts
Understand general ESX server troubleshooting guidelines
Whether you start troubleshooting on the management side or the virtualisation side will depend on the problem.
The vSphere client will give you a graphical view of the configuration, storage, networking etc.
Alternatively you can log in to a host using SSH, or connect to the vSphere Management Assistant (vMA) or vCLI environments and then connect to that particular host.
ESX host troubleshooting:
Search /var/log/messages, /var/log/vmkernel, /var/log/vmwarning and /var/log/secure.
Commands start with esxcfg-
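As a rough sketch of the log search above, the helper below greps a list of log files for a pattern and skips any file that is missing on the host. The name search_logs is made up for illustration; it is not a VMware tool.

```shell
#!/bin/sh
# Sketch only: search_logs is a hypothetical helper, not part of ESX.
# Usage: search_logs PATTERN FILE...
search_logs() {
    pattern="$1"
    shift
    for log in "$@"; do
        # Skip logs that are missing or unreadable on this host
        [ -r "$log" ] || continue
        # -i: ignore case, -H: prefix every match with the file name
        grep -iH -- "$pattern" "$log"
    done
    # A search with no matches is not treated as a failure
    return 0
}
```

On an ESX service console this could be called as search_logs "scsi" /var/log/messages /var/log/vmkernel /var/log/vmwarning /var/log/secure, where "scsi" is only an example pattern.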
ESXi host troubleshooting:
Log in to Technical Support Mode (Alt+F1).
Search /var/log/messages and /var/log/vmware/hostd-x.log
Most commands start with esxcfg-, but there is also vim-cmd, which is helpful when viewing virtual machines. If you use the vCLI tools then the commands start with vicfg-
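To illustrate the vim-cmd point, here is a minimal sketch that lists the virtual machines registered on the host with vim-cmd vmsvc/getallvms. The wrapper name list_vms is hypothetical; the guard is there only so the script fails cleanly when it is run somewhere other than on the host.

```shell
#!/bin/sh
# Sketch only: list_vms is a hypothetical wrapper around vim-cmd.
list_vms() {
    # vim-cmd exists on the host itself, not in the vMA/vCLI environments
    if ! command -v vim-cmd >/dev/null 2>&1; then
        echo "vim-cmd not found; run this on the ESXi host" >&2
        return 1
    fi
    # Print the registered virtual machines and their IDs
    vim-cmd vmsvc/getallvms
}
```

Calling list_vms from Technical Support Mode should print one line per registered virtual machine.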
Troubleshooting Common Installation Issues
There are three common installation issues. The first is installing ESX/ESXi on hardware which isn’t on the HCL; doing this will cause problems. The second is incorrectly partitioning the disk, i.e. partitioning SAN storage when the installation should be local, or booting from SAN and presenting LUNs that belong to other ESX/ESXi installations; more info here. The third is detecting which network adapter you should be using, because ESX/ESXi lists the network adapters by PCI address. The best way to identify which network adapter is which is to plug a network cable into the adapter and type esxcfg-nics -l; this will list the physical network adapters and their link status. Once you know which adapter to use, it is essential to configure the correct IP address, subnet mask, gateway and DNS server addresses. More information on installation best practice can be found here.
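As a rough sketch of the cable check, the helper below filters the output of esxcfg-nics -l down to the adapters that report link. The function name nics_with_link is made up for illustration, and the script assumes the link state is the fourth column of the listing (after Name, PCI and Driver); check the header line on your own host before relying on it.

```shell
#!/bin/sh
# Sketch only: nics_with_link is a hypothetical helper.
# Reads `esxcfg-nics -l` output on stdin and prints the names of
# adapters whose Link column reads "Up".
# Assumption: Link is the 4th whitespace-separated column.
nics_with_link() {
    # NR > 1 skips the header line; $1 is the adapter name (e.g. vmnic0)
    awk 'NR > 1 && $4 == "Up" { print $1 }'
}
```

On the host this would be used as esxcfg-nics -l | nics_with_link, run once before and once after plugging in the cable to see which adapter name appears.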
Monitoring ESX Server System Health
ESX/ESXi has built-in hardware monitoring; gone are the days of installing Dell OpenManage or the HP equivalent. The built-in hardware monitoring will monitor the health of:
- System voltage
- System temperature
- RAID battery
- Cabling and interconnect
- Software components
The above health monitoring can be viewed via the Hardware Status tab when an ESX/ESXi host is selected.
Understanding how to export diagnostic data
PSOD – Purple screen of death.
If ESX/ESXi encounters a purple screen of death then the memory and runtime logs will be dumped into the vmkcore partition. When the ESX/ESXi server reboots, the memory and runtime log dumps are archived into vmkernel-zdump-<datestamp>.zip, which is stored in /root on ESX and in /var/core on ESXi.
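After a reboot it can be useful to confirm that a dump archive was actually written. The sketch below looks for files beginning with vmkernel-zdump- in whichever directories it is given; find_zdumps is a hypothetical helper name, and the two directories in the usage example are the ones named above.

```shell
#!/bin/sh
# Sketch only: find_zdumps is a hypothetical helper.
# Usage: find_zdumps DIR...
find_zdumps() {
    for dir in "$@"; do
        # Ignore directories that do not exist on this host type
        [ -d "$dir" ] || continue
        # List any dump archives; stay quiet if none are present
        ls -1 "$dir"/vmkernel-zdump-* 2>/dev/null
    done
    # Finding no dumps is not treated as a failure
    return 0
}
```

Running find_zdumps /root /var/core covers both ESX and ESXi, since the directory that does not apply is skipped or simply contains no matching files.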
Diagnostic data can also be exported via the vSphere client.
ESX/ESXi standalone: File > Export > Export System Logs
vCenter: File > Export > Export System Logs (this includes data from vCenter also)