...
- MSR's Monsoon vs. UCSD's Fat-Tree commodity system
- Want network connecting many 1GigE nodes with no oversubscription
- MSR uses a hierarchical configuration (10GigE aggregation and core switches, 1GigE TOR switches)
- UCSD's design uses identical, commodity 48-port 1GigE switches throughout (i.e. k = 48)
- Both theoretically capable of 1:1 oversubscription (i.e. no oversubscription)
| | Hierarchical | Fat-tree |
|---|---|---|
| # hosts | 25,920 | 27,648 |
| # switches | 108 x 144-port 10GigE + 1,296 x 20-port 1GigE w/ 2x10GigE uplinks | 2,880 x 48-port 1GigE |
| # wires | 57,024 (~91% GigE, ~9% 10GigE) | 82,944 |
| # unique paths | 144 (36 via core with 2x dual uplinks in each subtree) | 572 |
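The fat-tree column can be sanity-checked against the standard k-ary fat-tree formulas; a minimal sketch, assuming the usual 3-tier construction built entirely from identical k-port switches (as in the UCSD paper):

```python
# Sanity check of the fat-tree column for a k-ary, 3-tier fat-tree
# built entirely from identical k-port switches.
def fat_tree_params(k):
    hosts = k**3 // 4          # (k/2)^2 hosts per pod * k pods
    edge  = k * (k // 2)       # k pods * k/2 edge (ToR) switches
    agg   = k * (k // 2)       # k pods * k/2 aggregation switches
    core  = (k // 2) ** 2      # (k/2)^2 core switches
    links = 3 * k**3 // 4      # host-edge + edge-agg + agg-core links
    return hosts, edge + agg + core, links

print(fat_tree_params(48))     # (27648, 2880, 82944) -- matches the table
```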
- Notes:
- 48-port 1GigE switches cost ~$2.5-3k
- 2,880 * $2500 = $7.2M
- 20-port 1GigE switches w/ 10GigE uplinks probably cost about the same (~$2.5-3k) [uplinks are not commodity]
- 1,296 * $2500 = $3.24M
- 144-port 10GigE switches advertised as $1500/port ($216k/switch) in mid-2007
- to be competitive with the fat-tree on per-port cost, the price per port must drop 6.25x to $241.76 ($34.8k/switch)
- a 6.25x drop is roughly in line with the price drop commonly seen over ~2 years
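A back-of-envelope check of the break-even figure, assuming $2,500 per commodity switch; the exact answer moves with the assumed switch price, which is presumably why this lands near, but not exactly on, the $241.76 / 6.25x quoted above:

```python
# Rough network cost comparison using the price estimates from the notes above.
COMMODITY_SWITCH = 2_500            # ~$2.5-3k per 48-port or 20-port 1GigE switch
CORE_PORT_2007   = 1_500            # advertised $/port, 144-port 10GigE, mid-2007

fat_tree_total  = 2_880 * COMMODITY_SWITCH    # $7.2M
per_host_budget = fat_tree_total / 27_648     # ~$260 of network per host

hier_tor_cost = 1_296 * COMMODITY_SWITCH      # $3.24M
core_ports    = 108 * 144                     # 15,552 10GigE core/agg ports

# Core $/port at which the hierarchical design matches the fat-tree's cost per host
breakeven = (25_920 * per_host_budget - hier_tor_cost) / core_ports
print(round(breakeven), round(CORE_PORT_2007 / breakeven, 1))  # ~$226/port, ~6.6x drop
```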
Alternative Network Topologies
- Ideas from supercomputing:
- Hypercubes
- Tori (torus networks)
- IBM Blue Gene connects tens of thousands of CPUs with high bandwidth (e.g. 380MB/sec with 4.5usec avg. ping-pong latency - link)
- Hosts connect to n neighbours and route amongst themselves
- Requires hosts to route frames
- => higher latencies, unless we can do it on the NIC (NetFPGA?)
- High wiring complexity
- unclear how this compares to the already high wiring complexity of the hierarchical and, especially, fat-tree topologies
- May impose greater constraints on cluster geometry (physical node placement) in order to establish the required links (?)
- No dedicated switching elements, simpler (electrically) point-to-point links
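To make "hosts route amongst themselves" concrete, a minimal, purely illustrative sketch of dimension-order routing on a k-ary 3D torus (the coordinates and k = 8 below are made up, not a proposal):

```python
# Dimension-order routing on a k-ary 3D torus: every host forwards frames
# itself, one dimension at a time, taking the shorter way around each ring.
def next_hop(src, dst, k=8):
    hop = list(src)
    for dim in range(3):
        if src[dim] != dst[dim]:
            fwd = (dst[dim] - src[dim]) % k        # distance going "up" the ring
            step = 1 if fwd <= k - fwd else -1     # pick the shorter direction
            hop[dim] = (src[dim] + step) % k
            return tuple(hop)
    return tuple(hop)                              # already at the destination

# Route from (0,0,0) to (3,6,1) in an 8x8x8 torus, printing each hop:
node, dst = (0, 0, 0), (3, 6, 1)
while node != dst:
    node = next_hop(node, dst)
    print(node)
```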
RAMCloud Requirements
- latency
- Arista 48-port 10GigE switches advertise a minimum of 600nsec latency (no idea what the distribution looks like)
- across 6 hops, that's > 3.6 usec
- Woven Systems' 144-port 10GigE switches advertise 1.6usec port-to-port latency (~2.7x Arista's minimum)
- => > 3.2usec across just the first two levels of the hierarchy
- takeaway: sub-5usec latency is probably not currently possible
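The hop arithmetic behind the two bullets above, using only the advertised best-case per-hop figures (queueing, NIC, and host software time are all extra):

```python
# Best-case switching latency = advertised per-hop latency * number of switch hops.
def path_latency_usec(per_hop_usec, hops):
    return per_hop_usec * hops

print(path_latency_usec(0.6, 6))   # Arista 48-port 10GigE, 6 hops   -> 3.6 usec
print(path_latency_usec(1.6, 2))   # Woven 144-port 10GigE, 2 levels -> 3.2 usec
```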
- bandwidth
- 128 bytes / object * 1.0e6 objects/second = 122MBytes/sec (not including any packet overhead)
- this is gigabit range... 10GigE vs. GigE may be a significant question:
- Arista 48-port 10GigE switches are not commodity (~$20k/switch, vs. $2-3k/switch for commodity 1GigE)
- But what if we have much bigger, hot objects on a machine?
- Do we want to assume a single machine can always handle requests?
- e.g. 10KByte object => max. ~12,500 requests/sec on gigabit
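A quick sketch of where the gigabit ceiling bites as object size grows (payload bytes only, ignoring packet and protocol overhead):

```python
# Line-rate ceiling on requests/sec for a given object size and link speed.
def max_requests_per_sec(object_bytes, link_gbps):
    return (link_gbps * 1e9 / 8) / object_bytes

print(max_requests_per_sec(128, 1))      # ~976,562/sec: ~1M 128-byte objects/sec roughly saturates 1GigE
print(max_requests_per_sec(10_000, 1))   # 12,500/sec: the 10KByte case above
print(max_requests_per_sec(10_000, 10))  # 125,000/sec on 10GigE
```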
Misc. Thoughts
- If networking accounts for only a small part of total DC cost, why is oversubscription so common today?
- it's possible to pay more and reduce oversubscription - cost doesn't seem to be the major factor
- but people argue that oversubscription leads to significant bottlenecks in real DCs
- but, then, why aren't they reducing oversubscription from the get-go?