We recently encountered a problem that nearly ground our VMware ESX farm to a halt.
The cause was an iSCSI lock, created when 2 hosts tried to write to the same store at the same time.
This was evident from the error message on the ESX hosts, vprob.vmfs.heartbeat.timedout, which referenced one of the volumes on our iSCSI storage.
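If you want to check whether a host has logged the same error, a minimal sketch along these lines can scan log files for it. This is an illustration only: the function name and the idea of passing log file paths on the command line are my own assumptions, and the exact log location and line format on your ESX hosts may differ. The only thing taken from our incident is the message string itself.

```python
# Message string seen on our ESX hosts during the incident.
MARKER = "vprob.vmfs.heartbeat.timedout"


def find_heartbeat_timeouts(lines):
    """Return (line_number, text) pairs for lines that mention MARKER.

    `lines` is any iterable of strings, e.g. an open log file.
    The helper name is hypothetical, not part of any VMware tooling.
    """
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if MARKER in line
    ]


if __name__ == "__main__":
    import sys

    # Assumption: log file paths are given as arguments; which file
    # holds these messages depends on your ESX version and setup.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for number, text in find_heartbeat_timeouts(handle):
                print(f"{path}:{number}: {text}")
```

Each matching line should name the affected volume, which is what pointed us at the right store.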
This caused connectivity problems for the entire ESX host and affected the guests that resided on that volume.
Because the host and guests were not accessible through vSphere, we were unable to remove the volume or power cycle the guests.
After much digging around, and with the help of VMware support, we learned that the cause of the problem was a lock on that filesystem. To fix it we ran vmkfstools -L lunreset /vmfs/devices/disks/volumename…
This removed the iSCSI lock that had been created when 2 hosts tried to write to the same volume at the same time.
Very painful, but happy to have the cluster back up and running.