Performing basic troubleshooting for storage
Identifying storage contention issues
Investigate the adapter queue depth first: run esxtop (or resxtop remotely) and press d for the disk adapter view or u for the disk device view to check the number of active and queued requests.
To identify the loaded storage controller driver before checking its queue depth or other settings, use vmkload_mod -l | grep qla2xxx (or grep lpfcdd for Emulex); to change the storage controller configuration, use esxcfg-module -s.
To get an idea of how much data is being read or written per controller, select the controller and then disk read rate, disk write rate and disk usage from within the performance charts; selecting a stacked graph shows consumption per virtual machine side by side. You can also drill down into the virtual machine itself to see its disk usage.
If one virtual machine is using so much disk I/O that other virtual machines suffer disk latency problems, Disk.SchedNumReqOutstanding can help; this setting limits how many I/Os each virtual machine can place in the storage controller driver queue. Disk.SchedNumReqOutstanding only applies to virtual machines sharing the same datastore, and the VMkernel must have witnessed six I/O switches between virtual machines before it throttles the number of requests each one can send.
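The effect of the per-virtual-machine limit can be sketched as a small model. This is a deliberately simplified illustration, not the VMkernel's actual scheduler: it ignores the I/O switch count described above, the function name is hypothetical, and the default of 32 is an assumption about the usual value of Disk.SchedNumReqOutstanding.

```python
def per_vm_outstanding_limit(active_vms: int,
                             lun_queue_depth: int,
                             sched_num_req_outstanding: int = 32) -> int:
    """Return the approximate number of I/Os one VM may have queued.

    Simplified model (assumption): a single active VM may use the whole
    LUN queue, while several VMs sharing the datastore are each capped at
    Disk.SchedNumReqOutstanding, never exceeding the LUN queue depth.
    """
    if active_vms < 0 or lun_queue_depth <= 0 or sched_num_req_outstanding <= 0:
        raise ValueError("counts and queue depths must be positive")
    if active_vms <= 1:
        # No contention on the datastore: the per-VM cap is not applied.
        return lun_queue_depth
    # Contention: each VM is limited to the smaller of the two values.
    return min(sched_num_req_outstanding, lun_queue_depth)


if __name__ == "__main__":
    for vms in (1, 2, 4):
        print(vms, "active VM(s):", per_vm_outstanding_limit(vms, 64))
```

Lowering the setting in this model reduces what a single busy virtual machine can queue once others are competing for the same datastore, which is the behaviour the paragraph above describes.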
SCSI reservations are another cause of disk contention, because each reservation causes the VMkernel to lock the datastore. SCSI reservations are caused by:
- Powering on a virtual machine
- Snapshot creation
- Snapshot or virtual disk growth
- Deletion of files on the datastore
Identifying storage overcommitment issues
Overcommitment centres on thin provisioning; thin provisioning allows large disks to be created, but only the space actually used is allocated. This can lead to provisioned storage being greater than the datastore capacity. This is where VMware alarms come in handy: use the datastore disk overallocation (%) trigger to alert when the percentage reaches a defined threshold.
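As a worked example of the arithmetic behind such an alarm, the sketch below assumes the overallocation percentage is provisioned space divided by datastore capacity. The function names and the sample figures are illustrative only and are not taken from vCenter.

```python
def overallocation_percent(provisioned_gb: float, capacity_gb: float) -> float:
    """Provisioned space as a percentage of datastore capacity."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be greater than zero")
    return provisioned_gb / capacity_gb * 100


def exceeds_threshold(provisioned_gb: float, capacity_gb: float,
                      threshold_percent: float) -> bool:
    """True when the overallocation reaches the alarm threshold."""
    return overallocation_percent(provisioned_gb, capacity_gb) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical datastore: 1000 GB capacity, 1500 GB of thin disks provisioned.
    pct = overallocation_percent(1500, 1000)
    print(f"overallocation: {pct:.0f}%")
    print("alarm fires at 120%:", exceeds_threshold(1500, 1000, 120))
```

A value above 100% means the virtual disks could, if fully written, need more space than the datastore can provide.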
Identifying storage connectivity issues
A simple way to monitor storage connectivity is to use the predefined alarm ‘Cannot connect to storage’, which has several triggers:
- Lost storage connectivity
- Lost storage path redundancy
- Degraded storage path redundancy
Datastore usage on disk is useful to alert on volumes reaching capacity.
Identifying iSCSI software initiator configuration issues
iSCSI communication can use one or two VMkernel ports (two for redundancy), each with its own IP address; connectivity can be tested using vmkping.
The iSCSI initiator must be enabled and VMkernel ports configured in order to discover iSCSI targets. The iSCSI IQN must be configured correctly in order for the iSCSI appliance to mask the LUN correctly.
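A mistyped IQN is a common reason for LUN masking to fail, so a quick format check can help. The sketch below assumes the usual iqn.yyyy-mm.reversed-domain[:identifier] layout; the regular expression is a simplification (it only accepts lowercase names and does not cover eui. names), and the sample IQNs are made up.

```python
import re

# Simplified pattern: "iqn.", a year-month date, a reversed domain name,
# and an optional ":identifier" suffix.
IQN_PATTERN = re.compile(
    r"iqn\.\d{4}-(0[1-9]|1[0-2])\.[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:.+)?"
)


def looks_like_iqn(name: str) -> bool:
    """Return True if the name matches the simplified IQN layout."""
    return IQN_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in (
        "iqn.1998-01.com.vmware:esx01-12345678",  # plausible host IQN
        "iqn.1998-13.com.vmware:esx01",           # invalid month
        "iqn.com.vmware:esx01",                   # missing date field
    ):
        print(candidate, "->", looks_like_iqn(candidate))
```

The check only confirms the format; the name still has to match exactly what is entered in the masking configuration on the iSCSI appliance.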
esxcli is used to configure VMkernel ports and uplinks.
esxcfg-vswitch -m <MTU> <vSwitch> is used to configure the MTU of a vSwitch.
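When the MTU has been raised for jumbo frames, vmkping can confirm that large frames pass end to end without fragmentation. The sketch below works out the ICMP payload size for a given MTU, assuming IPv4 with a 20-byte IP header and an 8-byte ICMP header, and builds a vmkping command line using the -d (do not fragment) and -s (payload size) options. The helper names and the target address are hypothetical.

```python
IPV4_HEADER_BYTES = 20   # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8    # ICMP echo header


def vmkping_payload_size(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError("MTU is too small to carry an ICMP payload")
    return mtu - overhead


def vmkping_command(target_ip: str, mtu: int) -> list[str]:
    """Build the argument list for a do-not-fragment vmkping test."""
    return ["vmkping", "-d", "-s", str(vmkping_payload_size(mtu)), target_ip]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for the iSCSI target.
    print(" ".join(vmkping_command("192.0.2.10", 9000)))
    print(" ".join(vmkping_command("192.0.2.10", 1500)))
```

If the test with the larger payload fails while the standard-size one succeeds, some device in the path (vSwitch, VMkernel port, physical switch or target) is probably not configured for the larger MTU.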
CHAP authentication is now supported at the host and/or the target iSCSI appliance; CHAP uses a preshared passphrase, which should be documented.
Interpreting storage reports and storage maps
These reports show capacity, free space and snapshot space consumption.
Maps allow users to build storage maps using relationships between inventory and storage objects. Maps also have zoom and export to Visio facilities.