Network Substrate

Data Center Networks

  • Current data centers are purported to be highly specialised
    • hierarchical network topologies, with higher-bandwidth aggregation and core switches/routers
      • that is, data rates increase up the tree to handle the accumulated bandwidth of the many, slower leaves further down
    • requires big, specialised switches to maintain reasonable bandwidth
      • e.g. 100+ 10GigE switches with >100 ports each, at the core
        • pricey... the Woven Systems 144-port 10GigE switch debuted at $1500/port in mid-2007
          • current cost unknown
    • oversubscription is purportedly common
      • mainly affects bisection bandwidth (the data center isn't uniform - locality matters, else expect lower bandwidth)
      • implies congestion is possible, adding overhead for reliable protocols and packet latency
      • 2.5:1 to 8:1 ratios quoted by Al-Fares et al. ('08 SIGCOMM)
        • 2.5:1 means for every 2.5Gb/s of bandwidth available at the end hosts, only 1Gb/s is provisioned at the core (see the worked example after this list)
        • a saturated network, therefore, cannot run all hosts at full rates
  • Current hot trend is commoditisation
    • Google does this internally, Microsoft/Yahoo/Amazon probably similarly smart about it
      • they've solved it, but either find it too important to share, or don't yet need SIGCOMM papers
    • Nothing is standard. Requires modifications to routing and/or address resolution protocols
      • hacks to L2 or L3 routing
        • L4 protocols generally oblivious
        • need to be careful about not excessively reordering packets
      • non-standard is reasonable for DCs, since the internal network is open to innovation
    • Main idea is to follow in footsteps of commodity servers
      • From fewer, big, less hackable Sun/IBM/etc boxen to many smaller, hackable i386/amd64 machines running Linux/FreeBSD/something Microsofty
      • Clear win for servers (~45% of DC budget), less so for networks (~15%) [percentages from Greenberg et al., Jan '09 CCR]
        • Is 15% large enough to care that much about optimisation (Amdahl strikes again)?
        • Alternatively, is 15% small enough that we can increase it to get features we want (iWARP; full, non-blocking 10GigE bisection bandwidth; lower latencies; etc.)?
    • Similarly, Network Commoditisation => lots of similar, cheaper, simpler building blocks
      • i.e. many cheaper, (near-)identical switches with a single, common data rate
        • Favours Clos (Charles Clos) topologies such as the fashionable "fat-tree", i.e.:
          • Multi-rooted, wide trees with lots of redundancy, spreading bandwidth across a large # of links
          • large number of equal throughput paths between distant nodes
          • switches with equivalent #'s of ports used throughout
          • at most 6 hops from anywhere to anywhere in the system
          • scales massively
          • does not necessitate faster data rates further up the tree to avoid oversubscription
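
A minimal sketch of the oversubscription arithmetic referenced above (plain Python; the host count, link rate, and core capacity are illustrative assumptions, not figures from any particular data center):

    # Sketch: oversubscription ratio = worst-case aggregate host bandwidth
    # divided by the bisection bandwidth provisioned above the hosts.
    # All numbers below are illustrative assumptions, not measurements.

    def oversubscription(num_hosts, host_gbps, core_bisection_gbps):
        """Ratio of offered host bandwidth to provisioned core bandwidth."""
        return (num_hosts * host_gbps) / core_bisection_gbps

    def per_host_share(host_gbps, ratio):
        """Bandwidth each host can expect when all hosts push traffic across the core."""
        return host_gbps / ratio

    if __name__ == "__main__":
        # e.g. 1,000 hosts with 1GigE NICs but only 400 Gb/s of core bisection bandwidth
        ratio = oversubscription(num_hosts=1000, host_gbps=1.0, core_bisection_gbps=400.0)
        print(f"oversubscription = {ratio:.1f}:1")                          # 2.5:1
        print(f"per-host share   = {per_host_share(1.0, ratio):.2f} Gb/s")  # 0.40 Gb/s

So at 2.5:1, a fully loaded network leaves each 1GigE host roughly 0.4 Gb/s of cross-core bandwidth, which is the sense in which a saturated network cannot run all hosts at full rate.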

Fat-Trees

  • Size is defined by a factor k, the number of ports per identical switch in the network
  • 3-level hierarchy:
    • core level ((k/2)^2 switches)
    • pod level (k pods)
      • each pod has 2 internal layers of k/2 switches each => k switches/pod
    • end host level (k pods x (k/2) edge switches x (k/2) hosts each = k^3/4 total hosts; see the sizing sketch after the table below)
  • Scaling with k:

    k   | # hosts | # switches | # ports   | host:switch ratio | host:port ratio
    ----|---------|------------|-----------|-------------------|----------------
    4   | 16      | 20         | 80        | 0.8               | 0.2
    8   | 128     | 80         | 640       | 1.6               | 0.2
    16  | 1,024   | 320        | 5,120     | 3.2               | 0.2
    32  | 8,192   | 1,280      | 40,960    | 6.4               | 0.2
    48  | 27,648  | 2,880      | 138,240   | 9.6               | 0.2
    64  | 65,536  | 5,120      | 327,680   | 12.8              | 0.2
    96  | 221,184 | 11,520     | 1,105,920 | 19.2              | 0.2
    128 | 524,288 | 20,480     | 2,621,440 | 25.6              | 0.2
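
The table follows directly from the k-port formulas above; a short sketch (plain Python, standard library only) that reproduces those rows:

    # Sketch: sizing a k-port fat-tree using the formulas above.
    # core: (k/2)^2 switches; pods: k pods of k switches each; hosts: k^3/4.
    # Between hosts in different pods there are (k/2)^2 equal-cost core paths.

    def fat_tree_size(k):
        """Return (hosts, switches, ports) for a k-port fat-tree (k even)."""
        core_switches = (k // 2) ** 2
        pod_switches = k * k                # k pods, each with k switches
        switches = core_switches + pod_switches
        hosts = k ** 3 // 4                 # k pods x (k/2) edge switches x (k/2) hosts each
        ports = switches * k                # every switch contributes k ports
        return hosts, switches, ports

    if __name__ == "__main__":
        for k in (4, 8, 16, 32, 48, 64, 96, 128):
            hosts, switches, ports = fat_tree_size(k)
            print(f"k={k:3d}  hosts={hosts:7,}  switches={switches:6,}  ports={ports:9,}  "
                  f"host:switch={hosts / switches:.1f}  host:port={hosts / ports:.1f}")

The host:port ratio is fixed at 0.2 because hosts occupy exactly one fifth of all switch ports: host-facing ports number k^3/4, while total ports number 5k^3/4 (5k^2/4 switches of k ports each).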
