
  1. Strange bandwidth issues: ib_send_bw tests on the second rack (rc41-rc80, with ConnectX-3 NICs and 56Gbps SwitchX switches) show only ~1650MB/s between nodes.
    1. Bandwidth is apparently fine (~3200MB/s) between rc01-rc40, which are on the old switches (though they route through the new switches).
    2. Strangely, bandwidth between nodes in the first rack and nodes in the second rack is also fine (~3200MB/s).
      1. So the slowdown appears to be specific to new NICs talking to new NICs!?
  2. rc43's InfiniBand interface isn't detected. The card probably needs to be re-seated in its PCIe slot.
  3. A handful of ports on the second rack are down: rc45, rc53, rc59, rc63, rc70.
    1. Figure out whether these are cable, NIC, or switch port issues.
  4. Update the SSD firmware on all drives to version 0309 (http://www.crucial.com/support/firmware.aspx). Otherwise the drives will start crashing once they have been up for >= 5184 hours (216 days).
    1. Perhaps the easiest approach is to hex-edit the bootable updater's script so it flashes without any interaction, then PXE boot the updater on all machines?
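
To help prioritize the firmware update in item 4, here is a minimal Python sketch of the arithmetic around the 5184-hour threshold. It only assumes that a per-drive power-on-hours figure is available from somewhere (for example, read by hand from each drive's SMART data). The function names, the two-week safety margin, and the sample hour values are all hypothetical and are not taken from the notes above.

```python
# Sketch: how close is a drive to the 5184-hour threshold from item 4?
# The threshold comes from the notes; everything else here is illustrative.

FAILURE_THRESHOLD_HOURS = 5184  # drives start crashing at >= this many hours


def hours_remaining(power_on_hours: int) -> int:
    """Return the hours left before the threshold (0 if already reached)."""
    return max(0, FAILURE_THRESHOLD_HOURS - power_on_hours)


def needs_urgent_update(power_on_hours: int, margin_hours: int = 24 * 14) -> bool:
    """True if the drive is within margin_hours of the threshold.

    The default two-week margin is an arbitrary, hypothetical choice.
    """
    return hours_remaining(power_on_hours) <= margin_hours


if __name__ == "__main__":
    # The threshold expressed in days: 5184 / 24 = 216.
    print(f"Threshold: {FAILURE_THRESHOLD_HOURS / 24:.0f} days")

    # Hypothetical power-on hours, NOT real measurements from the cluster.
    sample_hours = {"rc01": 5100, "rc41": 1200}
    for node, hours in sorted(sample_hours.items()):
        status = "URGENT" if needs_urgent_update(hours) else "ok"
        print(f"{node}: {hours_remaining(hours)} hours remaining ({status})")
```

Sorting nodes by hours remaining would give a flashing order if the machines cannot all be PXE booted into the updater at once.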