The above appears to be caused by some NICs coming up as PCIe x4 (rather than x8) at boot. To check, run lspci -d 15b3: -xxx |grep ^70 |cut -d " " -f 4 as root (the full config space dump that -xxx asks for is only readable by root). If you see '82', all's good; '42' means x4. Anything else is probably even worse.
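For values other than 82 or 42, a small helper can make the byte easier to read. This is only a sketch, and it rests on an assumption: that the byte at offset 0x72 is the low byte of the PCIe Link Status register on these cards, so the low nibble is the link speed code and the high nibble is the link width. The function names decode_link_byte and check_nics are made up for illustration.

```sh
#!/bin/sh
# Decode one hex byte as printed by the lspci pipeline above.
# Assumption: it is the low byte of the PCIe Link Status register, so
#   bits 3:0 = link speed code (1 = 2.5 GT/s, 2 = 5 GT/s, ...)
#   bits 7:4 = low four bits of the negotiated link width
# A x16 link would not fit in this byte and would show up as x0 here.
decode_link_byte() {
    val=$((0x$1))            # hex string -> integer
    speed=$((val & 0xf))     # low nibble
    width=$((val >> 4))      # high nibble
    echo "x${width} speed-code ${speed}"
}

# Hypothetical wrapper: decode the byte for every device with vendor ID 15b3.
# Needs root, since -xxx reads the full config space.
check_nics() {
    lspci -d 15b3: -xxx | grep '^70' | cut -d " " -f 4 | while read -r b; do
        echo "$b -> $(decode_link_byte "$b")"
    done
}
```

Running check_nics as root should then print one line per device. As a cross-check that does not depend on the register offset, lspci -d 15b3: -vv run as root reports the negotiated speed and width on the LnkSta line.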

It appears to have come back after a reboot or power cycle. Perhaps it's related to the x4 vs. x8 PCIe issue.

Try running ibdiagnet -P all=1 as root, then look at /var/tmp/ibdiagnet2/ibdiagnet2.log to find out which links look dubious (i.e. are experiencing bit errors).