...
- congestion results in:
- packet loss if buffers overflow
- else, increased latency from waiting in line
- which is worse?
- if RAMCloud is fast enough, occasional packet loss may not be horrible
- buffering may cause undesired latency and variability in latency
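A toy sketch of the tradeoff above: a finite FIFO buffer on one egress port either drops packets on overflow or makes later arrivals wait behind the backlog. Numbers (link rate, packet size, buffer depth) are illustrative only, not from any real switch.

```python
# Loss-vs-latency tradeoff at a single congested egress port (toy model).
from collections import deque

LINK_RATE = 10e9 / 8          # egress link rate in bytes/sec (10 Gb/s, assumed)
PKT_SIZE = 1500               # bytes per packet (assumed)
BUFFER_PKTS = 100             # buffer capacity in packets (assumed)

def enqueue(queue, dropped):
    """Offer one packet to the finite FIFO; overflow means loss."""
    if len(queue) >= BUFFER_PKTS:
        return dropped + 1    # buffer overflow -> packet loss
    queue.append(PKT_SIZE)
    return dropped

def queueing_delay(queue):
    """Time a newly-arrived packet waits behind the current backlog."""
    return sum(queue) / LINK_RATE   # seconds

queue, dropped = deque(), 0
for _ in range(150):          # a burst of 150 packets funnels into one port
    dropped = enqueue(queue, dropped)

print(dropped)                            # 50 packets lost to overflow
print(round(queueing_delay(queue) * 1e6))  # 120 us wait behind a full buffer
```

Bigger buffers trade the first number for the second: fewer drops, but every packet behind a full buffer eats the queueing delay, which is exactly the latency variability the note worries about.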
- even if no oversubscription, congestion is still an issue
- e.g. any time multiple flows funnel from several ports to one port (within the network) or host (at the leaves)
- conceivable in RAMCloud, as we expect to communicate with many different systems
- e.g. could be a problem if a client issues enough sufficiently large requests to a large set of servers
- UDP has no congestion control mechanisms
- connectionless, unreliable protocol probably essential for latency and throughput goals
- need to avoid congestion how?
- rely on the user to stagger queries/reduce parallelism? [cf. Facebook]
- if we're sufficiently fast, will we run into these problems anyhow?
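Since UDP gives us nothing, one option hinted at above is application-level self-restraint: cap the number of requests a client keeps in flight when fanning out to many servers. A minimal sketch, with a semaphore standing in for the cap; the names and the use of `asyncio.sleep` as a stand-in for a UDP request/response are hypothetical, not RAMCloud's actual API.

```python
# Sketch: client-side cap on outstanding requests to avoid self-inflicted
# incast when fanning out over an uncontrolled protocol like UDP.
import asyncio

MAX_OUTSTANDING = 8            # in-flight request cap per client (assumed)

async def fetch(server_id, sem):
    async with sem:            # blocks once MAX_OUTSTANDING are in flight
        await asyncio.sleep(0.001)   # stand-in for one UDP request/response
        return server_id

async def fan_out(n_servers):
    sem = asyncio.Semaphore(MAX_OUTSTANDING)
    return await asyncio.gather(*(fetch(i, sem) for i in range(n_servers)))

results = asyncio.run(fan_out(100))   # 100-server fan-out, 8 at a time
print(len(results))                   # 100
```

This is the "reduce parallelism" knob made explicit: the fan-out still reaches every server, but never funnels more than MAX_OUTSTANDING responses toward one host at once.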
- Balaji's points: buffers don't scale with bandwidth increases
- simply can't get 2x buffers with similar increase in bandwidth at high end
- further, adding more bandwidth and keeping a reservation for temporary congestion is better than adding buffers
- especially for RAMCloud - reduces latency
- is this an argument against commodity (at least, against a pure commodity fat-tree)?
- ECN - Explicit Congestion Notification
- already done by switches - set the ECN bits in the IP header (the former TOS field) when nearing congestion, with greater probability as the queue approaches saturation
- mostly for sustained flow traffic
- RAMCloud expects lots of small datagrams, rather than flows
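The marking rule sketched above is RED-style: below a low queue threshold nothing is marked, above a high threshold everything is, and in between the marking probability rises linearly. A minimal sketch with illustrative thresholds (not taken from any particular switch):

```python
# RED-style ECN marking: probability rises linearly between two
# queue-depth thresholds. Thresholds and max probability are assumed.
import random

MIN_TH, MAX_TH = 20, 80        # queue-depth thresholds in packets (assumed)
MAX_P = 0.1                    # marking probability just below MAX_TH (assumed)

def should_mark(queue_depth):
    if queue_depth < MIN_TH:
        return False           # queue short: no congestion signal
    if queue_depth >= MAX_TH:
        return True            # near saturation: always mark (or drop)
    # between thresholds, mark with probability growing linearly to MAX_P
    p = MAX_P * (queue_depth - MIN_TH) / (MAX_TH - MIN_TH)
    return random.random() < p

print(should_mark(10))   # False
print(should_mark(90))   # True
```

The catch for RAMCloud is in the last bullet: this feedback loop assumes a sender with a sustained flow that can back off on seeing marks; a sender firing independent small datagrams may be gone before the signal arrives.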
"Data Center Ethernet"
- Cisco: "collection of standards-based extensions to classical Ethernet that allows data center architects to create a data center transport layer that is:"
- stable
- lossless
- efficient
- Purpose is apparently to buck the trend of building multiple application-specific networks (IP, SAN, Infiniband, etc)
- how? better multi-tenancy (traffic class isolation/prioritisation), guaranteed delivery (lossless transmission), layer-2 multipath (higher bisection bandwidth)
- A series of additional standards:
- "Class-based flow control" (CBFC)
- for multi-tenancy
- Enhanced transmission selection (ETS)
- for multi-tenancy
- Data center bridging exchange protocol (DCBCXP)
- Lossless Ethernet
- for guaranteed delivery
- Congestion notification
- end-to-end congestion management to avoid dropped frames (i.e. work around TCP congestion collapse, retrofit non-congestion-aware protocols to not cause trouble)
- "Class-based flow control" (CBFC)
...