Two of our vSAN clusters consist of VxRail S570 nodes with Intel X710 NICs. A few weeks ago, the ESXi 6.5 hosts started failing repeatedly: not all at once, but a different host each time. "Failed" here means that the host was shown as “not responding” in the vSphere Client, its VMs stopped running, and vSphere HA (vSphere Availability) restarted them on other hosts.
Naturally, we first suspected a network problem, but the logs on the physical upstream switches showed no related events, and the other hosts were unaffected.