Potential Clusters for Experiment (WIP)

This page lists clusters that are potentially suitable for evaluating Homa Transport. Specifically, we are interested in low-latency Ethernet NICs and switches that offer (1) at least 8 data queues and (2) strict-priority (SP) queuing for 802.1p-tagged traffic.
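
For concreteness, here is a minimal sketch of the dequeue behavior we expect from a strict-priority scheduler with 8 queues. The 8-queue count and the 0 = lowest / 7 = highest convention come from 802.1p; everything else (class names, packet labels) is purely illustrative and not tied to any particular NIC or switch.

    from collections import deque

    NUM_QUEUES = 8  # one queue per 802.1p priority (PCP) value, 0..7

    class StrictPriorityScheduler:
        """Toy model of strict-priority (SP) queuing: queue i is served
        only when every higher-priority queue (i+1 .. 7) is empty."""

        def __init__(self):
            self.queues = [deque() for _ in range(NUM_QUEUES)]

        def enqueue(self, packet, priority):
            self.queues[priority].append(packet)

        def dequeue(self):
            # Scan from the highest priority down; lower-priority queues
            # can be starved under sustained high-priority load.
            for prio in reversed(range(NUM_QUEUES)):
                if self.queues[prio]:
                    return prio, self.queues[prio].popleft()
            return None

    if __name__ == "__main__":
        sched = StrictPriorityScheduler()
        sched.enqueue("bulk-data", 1)
        sched.enqueue("grant", 7)
        sched.enqueue("grant", 7)
        # Prints both priority-7 packets before the priority-1 packet.
        while True:
            item = sched.dequeue()
            if item is None:
                break
            print(item)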


Candidate 1: d430 nodes at Emulab Utah

Summary

  • 160 nodes (4 racks x 40 nodes/rack) in total
  • 1 dual-port or 1 quad-port Intel X710 10GbE PCI Express NIC [1]
    • probably supports strict priority on 802.1p-tagged traffic: "High priority packets are processed before lower priority packets" [2]; the number of hardware queues the NIC exposes can be checked directly on a node (see the sketch after this list)
  • 1 Dell Z9500 132-port 40Gb switch (latency 600ns - 2us, 8 QoS data queues) connects all 10Gb interfaces
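
Whether the X710 on a d430 node actually exposes at least 8 hardware queues can be checked directly once we get a node. A hedged sketch (the interface name eno1 is a placeholder, and it assumes ethtool is available on the Emulab image; ethtool -i reports the driver, which should be i40e for the X710, and ethtool -l reports the hardware channel/queue counts):

    import subprocess

    # Placeholder interface name; find the real 10GbE port with `ip -br link`.
    IFACE = "eno1"

    def run(cmd):
        """Run a command and return its stdout, raising on failure."""
        return subprocess.run(cmd, check=True, capture_output=True,
                              text=True).stdout

    if __name__ == "__main__":
        # Driver/firmware info: the X710 should report the i40e driver.
        print(run(["ethtool", "-i", IFACE]))
        # Hardware channel (queue) counts: we want at least 8.
        print(run(["ethtool", "-l", IFACE]))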


QoS Configuration

NIC


Switch

If we had complete control over the switch (unlikely, since it connects all 160 nodes), the global QoS configuration commands (documented in Chapter 44 of the Command-Line Reference Guide) would suffice. Otherwise, we will have to dig into policy-based QoS configuration, whose overall paradigm/pipeline looks fairly complex.


[1] Intel X710 10GbE product brief: http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/ethernet-x710-brief.pdf

[2] User Guides for Intel Ethernet Adapters: https://downloadcenter.intel.com/download/11848/User-Guides-for-Intel-Ethernet-Adapters?product=75021

Utah Emulab hardware in general: https://wiki.emulab.net/wiki/Emulab/wiki/UtahHardware

Dell R430 (aka "d430") nodes: https://wiki.emulab.net/Emulab/wiki/d430

Emulab data plane layout: http://www.emulab.net/doc/docwrapper.php3?docname=topo.html

Dell Z9500 spec sheet: http://www.netsolutionworks.com/datasheets/Dell_Networking_Z9500_SpecSheet.pdf

Dell Z9500 Configuration Guide & Command-Line Reference Guide: http://www.dell.com/support/home/us/en/04/product-support/product/force10-z9500/manuals


Candidate 2: m510 nodes at CloudLab Utah

Summary

  • 270 nodes (6 chassis x 45 nodes/chassis) in total
  • Dual-port Mellanox ConnectX-3 10 Gb NIC (PCIe v3.0, 8 lanes)
    • PCIe 3.0 x8 gives roughly 63 Gb/s of host bandwidth per direction, so the host link is not the bottleneck for two 10 Gb ports (see the quick calculation after this list)
    • I couldn't find enough information about the number of hardware queues or whether the NIC supports SP queuing; the ethtool check sketched under Candidate 1 should answer the queue-count question once we have a node
  • 2 HP Moonshot 45XGc switches (45x10Gb ports, 4x40Gb ports for uplink to the core, latency < 1us, cut-through, 8 data queues) per chassis
  • 1 large HP FlexFabric 12910 switch (96x40Gb ports, full bisection bandwidth internally, latency 6 - 16us) connecting all chassis
  • They "have plans to enable some users to allocate entire chassis ... to have complete administrator control over the switches"
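
A quick back-of-the-envelope check of what "PCIe v3.0, 8 lanes" buys us (the 8 GT/s per-lane rate and the 128b/130b encoding are standard PCIe 3.0 parameters; the rest is plain arithmetic):

    # PCIe 3.0: 8 GT/s per lane with 128b/130b encoding.
    GT_PER_SEC = 8e9
    ENCODING = 128 / 130
    LANES = 8

    link_gbits = GT_PER_SEC * ENCODING * LANES / 1e9   # ~63 Gb/s per direction

    print(f"PCIe 3.0 x{LANES}: ~{link_gbits:.0f} Gb/s per direction")
    print("Two 10GbE ports need at most 20 Gb/s, so the host link is not the bottleneck.")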


QoS Configuration

NIC


Switch



CloudLab Utah cluster hardware specs: http://docs.cloudlab.us/hardware.html#%28part._cloudlab-utah%29 and https://www.cloudlab.us/hardware.php#utah

ConnectX-3 Dual-Port 10 GbE Adapters w/ PCI Express 3.0: http://www.mellanox.com/page/products_dyn?product_family=127

HP Moonshot 45XGc switch QuickSpecs: https://www.hpe.com/h20195/v2/getpdf.aspx/c04384058.pdf

HP Moonshot-45XGc Switch Module Manuals: http://h20565.www2.hpe.com/portal/site/hpsc/public/psi/home/?sp4ts.oid=7398915#manuals


Candidate 3: ?

TODO


Verify Priority Actually Works

vconfig + iperf is one option: create an 802.1Q VLAN interface with an egress QoS map (vconfig is deprecated; ip link add ... type vlan ... egress-qos-map ... does the same thing), drive it with concurrent flows that carry different PCP values, and check that the high-priority flow keeps its latency and throughput while the low-priority flow is squeezed when the switch egress is congested.
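
Below is a minimal sketch of a sender that could complement iperf for this test. It assumes a Linux host and a VLAN interface whose egress QoS map translates the socket priority into the 802.1p PCP bits; the interface name, VLAN id, and receiver address are all placeholders. Note that setting SO_PRIORITY above 6 requires CAP_NET_ADMIN, i.e. run it as root.

    # Assumed (hypothetical) setup on the sender, as root:
    #   ip link add link eth0 name eth0.10 type vlan id 10 \
    #       egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
    #   ip addr add 10.0.10.1/24 dev eth0.10 && ip link set eth0.10 up
    import socket
    import sys
    import time

    RECEIVER = ("10.0.10.2", 9000)   # placeholder address on the VLAN
    PAYLOAD = b"x" * 1400
    DURATION = 5.0                   # seconds

    def blast(priority):
        """Send back-to-back UDP datagrams at the given socket priority
        and report the offered rate."""
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # SO_PRIORITY sets the kernel's per-packet priority; the VLAN
        # egress-qos-map turns it into the PCP bits of the 802.1Q tag.
        # (Linux only; values 7 and above need CAP_NET_ADMIN.)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, priority)
        sent = 0
        deadline = time.time() + DURATION
        while time.time() < deadline:
            s.sendto(PAYLOAD, RECEIVER)
            sent += len(PAYLOAD)
        print(f"priority {priority}: {sent * 8 / DURATION / 1e6:.1f} Mbit/s offered")

    if __name__ == "__main__":
        blast(int(sys.argv[1]) if len(sys.argv) > 1 else 0)

Running one copy with priority 7 and another with priority 1 from two different senders toward the same receiver (so the switch egress port is oversubscribed), then comparing what the receiver actually gets, should make it obvious whether strict priority is in effect: the priority-7 traffic should see almost no loss while the priority-1 traffic absorbs the drops. Capturing on the receiver with tcpdump -e, which prints the 802.1Q tag, also confirms that the PCP values really make it onto the wire.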

Miscellaneous

Relationship between CloudLab and Emulab/AptLab/etc.: https://www.cloudlab.us/cluster-graphs.php