- Strange bandwidth issues: ib_send_bw tests between nodes in the second rack (rc41-rc80, with ConnectX-3 NICs and 56Gbps SwitchX switches) show only ~1650MB/s.
- Bandwidth is apparently fine (~3200MB/s) between rc01-rc40, which are on the old switches (though their traffic routes through the new switches).
- Strangely, bandwidth between nodes in the first rack and nodes in the second rack is also fine (~3200MB/s).
- So the problem appears only when new NICs talk to new NICs!?
- rc43's InfiniBand interface isn't detected. The card probably needs to be re-seated in the PCIe slot.
- A handful of ports on the second rack are down: rc45, rc53, rc59, rc63, rc70
- Figure out whether these are cable, NIC, or switch port issues.
- Update the firmware on all SSDs to version 0309 (http://www.crucial.com/support/firmware.aspx). Otherwise the drives will start crashing once they have been up for 5184 hours or more.
- Perhaps the easiest approach is to hex-edit the bootable updater's script so that it flashes without any interaction, then PXE boot the updater on all machines?
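
To decide which machines to flash first, it may help to rank nodes by how close their SSD is to the 5184-hour mark. The following is only a rough sketch: it assumes the relevant counter is SMART attribute 9 (Power_On_Hours) as printed by `smartctl -A`, and that the output for each node has already been collected somehow. The function names and the `hours_by_node` mapping are hypothetical helpers, not part of any existing tooling here.

```python
import re

# Threshold taken from the firmware note above.
FAILURE_THRESHOLD_HOURS = 5184


def parse_power_on_hours(smartctl_output):
    """Return the raw value of SMART attribute 9 from `smartctl -A` text.

    Assumes the usual ten-column attribute table where the raw value is the
    last column. Returns None if the attribute is not found.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "9":
            # Some drives report values like "1234h+56m"; keep the leading digits.
            match = re.match(r"\d+", fields[9])
            if match:
                return int(match.group())
    return None


def hours_remaining(power_on_hours, threshold=FAILURE_THRESHOLD_HOURS):
    """Hours left before the threshold; 0 means the drive is already past it."""
    return max(threshold - power_on_hours, 0)


def prioritize(hours_by_node):
    """Sort node names so the drives closest to the threshold come first."""
    return sorted(hours_by_node, key=lambda node: hours_remaining(hours_by_node[node]))
```

For scale, 5184 hours is 216 days of continuous operation, so any node that has been running for roughly seven months is already at risk. A call such as `prioritize({"rc01": 1000, "rc02": 5000})` would list rc02 first.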