...

  • congestion results in:
    • packet loss if buffers overflow
    • else, increased latency from waiting in line
    • which is worse?
      • if RAMCloud is fast enough, occasional packet loss may not be horrible
      • buffering may cause undesired latency and variability in latency (see the queueing-delay sketch after this list)
  • even if no oversubscription, congestion is still an issue
    • e.g. any time multiple flows funnel from several ports to one port (within the network) or host (at the leaves)
      • conceivable in RAMCloud, as we expect to communicate with many different systems
        • e.g.: could be a problem if client issues enough sufficiently large requests to a large set of servers
  • UDP has no congestion control mechanisms
    • a connectionless, unreliable protocol is probably essential for our latency and throughput goals
    • so how do we avoid congestion?
      • rely on the user to stagger queries/reduce parallelism? [cf. Facebook] (see the request-cap sketch after this list)
      • if we're sufficiently fast, will we run into these problems anyhow?
  • Balaji's points: buffer capacity doesn't scale with bandwidth increases
    • at the high end, you simply can't get 2x the buffering along with a similar increase in bandwidth
    • further, adding bandwidth and keeping headroom in reserve for temporary congestion is better than adding buffers
      • especially for RAMCloud, since it reduces latency
      • is this an argument against commodity hardware (at least, against a pure commodity fat-tree)?
  • ECN - Explicit Congestion Notification
    • already supported by switches: they mark the ECN bits in the IP header (part of the old TOS byte) when nearing congestion, with greater probability as queues approach saturation
    • mostly aimed at sustained flows
      • RAMCloud expects lots of small datagrams rather than long flows (see the ECN sketch below)
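
A quick back-of-envelope on the buffering point above. The link rate and queue depths below are assumed values for illustration, not RAMCloud measurements; the point is just that queued bytes translate directly into delay:

#include <cstdio>

int main() {
    // Assumed values for illustration (not RAMCloud measurements):
    // a 10 Gb/s edge link and a few plausible switch queue depths.
    const double linkGbps = 10.0;
    const double queuedBytes[] = {16e3, 256e3, 1e6};

    for (double bytes : queuedBytes) {
        // Time to drain the queue = queued bits / link rate, in microseconds.
        double delayUs = (bytes * 8.0) / (linkGbps * 1e3);
        std::printf("%8.0f bytes queued -> %6.1f us of added latency\n",
                    bytes, delayUs);
    }
    return 0;
}

At 10 Gb/s, 256 KB of queued packets already adds roughly 200 µs of delay, orders of magnitude above a few-microsecond RPC budget; buffering trades loss for exactly the latency and variability we care about.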
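
On the "stagger queries / reduce parallelism" option above: the simplest form is just a cap on outstanding requests per client (or per destination server). A minimal sketch; the cap is an assumed tuning knob, not something RAMCloud specifies:

#include <condition_variable>
#include <mutex>

// Sketch of a per-client (or per-destination) cap on outstanding requests,
// so a burst of RPCs to many servers can't all converge on one link at once.
class OutstandingRequestLimiter {
  public:
    explicit OutstandingRequestLimiter(int maxOutstanding)
        : max_(maxOutstanding), outstanding_(0) {}

    // Call before issuing a request; blocks while the cap is reached.
    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return outstanding_ < max_; });
        ++outstanding_;
    }

    // Call when a response arrives (or the request times out).
    void release() {
        std::lock_guard<std::mutex> lock(mutex_);
        --outstanding_;
        cv_.notify_one();
    }

  private:
    const int max_;
    int outstanding_;
    std::mutex mutex_;
    std::condition_variable cv_;
};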
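
For reference on the ECN item above: the mark lives in the two low-order bits of the old IP TOS byte, and a value of 0b11 means "Congestion Experienced". A minimal, Linux-specific sketch of how a UDP receiver could observe those marks (function and variable names are illustrative; the sender would also need to set an ECT codepoint so switches mark rather than drop):

#include <cstddef>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/uio.h>

// Sketch only: enable IP_RECVTOS on the socket at setup time, e.g.
//   int on = 1;
//   setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on));
// then each received datagram carries its TOS byte as ancillary data.

// Receives one datagram into buf and reports whether it carried the
// ECN "Congestion Experienced" mark (low two TOS bits == 0b11).
bool receivedWithCongestionMark(int fd, char* buf, size_t len) {
    char cbuf[CMSG_SPACE(sizeof(int))];
    struct iovec iov = { buf, len };
    struct msghdr msg = {};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    if (recvmsg(fd, &msg, 0) < 0)
        return false;

    for (cmsghdr* c = CMSG_FIRSTHDR(&msg); c != nullptr; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_TOS) {
            unsigned char tos = *reinterpret_cast<unsigned char*>(CMSG_DATA(c));
            return (tos & 0x03) == 0x03;   // ECN field: 11 = Congestion Experienced
        }
    }
    return false;
}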

"Data Center Ethernet"

  • Cisco: "collection of standards-based extensions to classical Ethernet that allows data center architects to create a data center transport layer that is:"
    • stable
    • lossless
    • efficient
  • Purpose is apparently to buck the trend of building multiple application-specific networks (IP, SAN, InfiniBand, etc.)
    • how? better multi-tenancy (traffic class isolation/prioritisation), guaranteed delivery (lossless transmission), and layer-2 multipath (higher bisection bandwidth)
  • A series of additional standards:
    • "Class-based flow control" (CBFC)
      • for multi-tenancy
    • Enhanced transmission selection (ETS)
      • for multi-tenancy
    • Data center bridging capability exchange protocol (DCBCXP)
    • Lossless Ethernet
      • for guaranteed delivery
    • Congestion notification
      • end-to-end congestion management to avoid dropped frames (i.e. work around TCP congestion collapse and retrofit non-congestion-aware protocols so they don't cause trouble?)
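
For concreteness on the class-based flow control / lossless Ethernet pieces: the mechanism is essentially a per-class PAUSE. A rough sketch of the priority flow control frame payload as standardized in IEEE 802.1Qbb (struct and field names here are illustrative; on the wire the fields are big-endian and the frame is a MAC control frame, EtherType 0x8808, sent to 01-80-C2-00-00-01):

#include <cstdint>

// Rough sketch of the 802.1Qbb priority flow control payload.
// Unlike classic 802.3x PAUSE (which stops the whole link), this carries
// one timer per traffic class, so a congested "lossless" class can be
// throttled while other classes keep flowing.
struct PfcFramePayload {
    uint16_t opcode;                 // 0x0101 identifies priority flow control
    uint16_t priorityEnableVector;   // low 8 bits: which class timers are valid
    uint16_t pauseQuanta[8];         // per-class pause time, in units of
                                     // 512 bit times (0 = resume immediately)
};

That per-class pause is what lets one traffic class be "lossless" without stalling everything else, which is where the multi-tenancy and guaranteed-delivery claims above come from.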

...