
  

Open Design Questions

  • Hot spots: How will hot spots affect the results? So far we have only simulated evenly distributed traffic patterns, and at some point we should see how things change when there are receiver hot spots. For example, in Homa it is quite possible that an overloaded receiver has to throttle the unscheduled bytes limit to less than one RTT's worth of bytes.

  • Delay variability: So far we have focused on a cap of 1 RTT; however, delay varies in the network and we need to take that into account. Furthermore, the RTT is assumed to be the minimum possible RTT, which occurs when a full 1538-byte packet is sent in one direction and a small 84-byte frame (the minimum Ethernet frame size) is sent in the other. Under high load, however, the average RTT can be much higher than this value, and enforcing the minimum RTT as the cap can waste bandwidth. So the question is: at high load and with high delay variability, how should the RTT value be calculated?

  • Higher link speeds: How will moving to next-generation network fabrics, with 40 Gb/s NIC links and 100 Gb/s fabric links, affect our results?

  • Adapting to link/switch failures: We have assumed a full-bisection-bandwidth network, but how should Homa detect and react to links and switches that temporarily fail and reduce the bisection bandwidth in some parts of the network?
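
To make the minimum-RTT and link-speed questions above more concrete, here is a rough sketch of how the serialization part of the minimum RTT could be computed for different link speeds. The frame sizes (1538 bytes and 84 bytes) come from the notes above; the function names, the store-and-forward assumption, and the example hop list are hypothetical. Propagation, switching, and host software delays are deliberately left out, so this is only a lower bound on one component of the RTT, not a measurement.

```python
# Frame sizes on the wire, taken from the notes above.
FULL_FRAME_BYTES = 1538  # full-size packet sent in one direction
MIN_FRAME_BYTES = 84     # minimum Ethernet frame sent in the other direction


def serialization_ns(frame_bytes, link_gbps):
    """Time to put one frame on a link, in nanoseconds.

    1 Gb/s equals 1 bit per nanosecond, so bits / Gb/s gives nanoseconds.
    """
    if link_gbps <= 0:
        raise ValueError("link speed must be positive")
    return frame_bytes * 8 / link_gbps


def min_rtt_serialization_ns(hop_speeds_gbps):
    """Serialization component of the minimum RTT over a path.

    ASSUMPTION: every hop is store-and-forward, so the full frame is
    serialized once per hop on the way out and the minimum frame once
    per hop on the way back. All other delays are ignored.
    """
    return sum(
        serialization_ns(FULL_FRAME_BYTES, speed)
        + serialization_ns(MIN_FRAME_BYTES, speed)
        for speed in hop_speeds_gbps
    )


if __name__ == "__main__":
    # Hypothetical path: NIC link, two fabric links, NIC link.
    for label, hops in [
        ("10 Gb/s everywhere", [10, 10, 10, 10]),
        ("40 Gb/s NIC, 100 Gb/s fabric", [40, 100, 100, 40]),
    ]:
        print(label, round(min_rtt_serialization_ns(hops), 2), "ns")
```

As link speeds go up, this component shrinks while the fixed delays do not, which is one reason the RTT value used as the cap may need to be revisited for faster fabrics.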

  

Voodoo Constants

  • The round-trip time
  • The total number of available priorities
  • How should we divide the priorities among the following packet types?
    • Control packets (i.e., grants)
    • Unscheduled packets (priority cut off among different message sizes)
    • Scheduled packets
    • Low priority redundant preemptive scheduled packets to avoid bubbles
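
One way to keep these constants explicit while experimenting is to collect them in a single structure that checks the split adds up. The sketch below is only an illustration: the class name, field names, the example split of 8 priorities, and the size cutoffs are all hypothetical, and it assumes that level 0 is the highest priority, that control packets take the top levels, and that smaller messages get higher unscheduled priority.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorityPlan:
    """Hypothetical split of the available priority levels by packet type."""
    total: int        # total number of available priorities
    control: int      # levels for control packets (grants)
    unscheduled: int  # levels for unscheduled packets
    scheduled: int    # levels for scheduled packets
    redundant: int    # levels for low-priority redundant scheduled packets

    def __post_init__(self):
        used = self.control + self.unscheduled + self.scheduled + self.redundant
        if used != self.total:
            raise ValueError(f"split uses {used} levels, expected {self.total}")


def unscheduled_priority(msg_bytes, cutoffs, plan):
    """Pick the priority level for an unscheduled packet.

    cutoffs is an ascending list of message-size cutoffs in bytes, one fewer
    than the number of unscheduled levels. A message whose size is at most
    cutoffs[i] falls into unscheduled level i.
    ASSUMPTION: level 0 is the highest priority and the control levels sit
    above the unscheduled ones.
    """
    if len(cutoffs) != plan.unscheduled - 1:
        raise ValueError("need exactly unscheduled - 1 cutoffs")
    return plan.control + bisect_left(cutoffs, msg_bytes)


if __name__ == "__main__":
    plan = PriorityPlan(total=8, control=1, unscheduled=3, scheduled=3, redundant=1)
    cutoffs = [1000, 10000]  # hypothetical cutoffs, in bytes
    for size in (500, 5000, 50000):
        print(size, "bytes ->", unscheduled_priority(size, cutoffs, plan))
```

Keeping the split in one place makes it easier to sweep these values in simulation rather than hard-coding them.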

Homa Paper ToDos

  1. Simulation comparison points ordered by importance

    1. self comparison when Homa features are removed

    2. pHost

    3. pFabric

    4. PIAS

    5. DCTCP

  2. Lower level measurement and analysis

    1. wasted bandwidth at receiver

    2. cumulative time average of priority usages

    3. time series of outstanding messages

  3. Algorithmic shortcomings

    1. we need to avoid using priorities for preempting large messages

    2. how should we pick parameters?

    3. need a module that is able to measure message size distribution

    4. how many priorities do we use and how many messages do we keep granted?
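
For the lower-level measurements in item 2, a minimal sketch of how a time series of outstanding messages and its time-weighted average could be derived from simulation events is given below. The event format, function names, and sample data are hypothetical, and reading "cumulative time average" as a time-weighted mean of a step function is an assumption; the same averaging could be applied per priority level to a 0/1 "in use" signal.

```python
def outstanding_series(events):
    """Build a step-function time series of outstanding messages.

    events is an iterable of (time, delta) pairs, with delta = +1 when a
    message arrives and -1 when it completes (hypothetical format).
    Returns a list of (time, count) pairs sorted by time, with events at
    the same time merged into one sample.
    """
    series = []
    count = 0
    for t, delta in sorted(events):
        count += delta
        if series and series[-1][0] == t:
            series[-1] = (t, count)
        else:
            series.append((t, count))
    return series


def time_average(series, end_time):
    """Time-weighted mean of a step function from its first sample to end_time.

    ASSUMPTION: end_time is not earlier than the last sample in the series.
    """
    if not series:
        raise ValueError("empty series")
    start = series[0][0]
    if end_time <= start:
        raise ValueError("end_time must be after the first sample")
    area = 0.0
    boundaries = series[1:] + [(end_time, None)]
    for (t0, value), (t1, _) in zip(series, boundaries):
        area += value * (t1 - t0)  # each value holds until the next sample
    return area / (end_time - start)


if __name__ == "__main__":
    # Hypothetical trace: two messages arrive, then both complete.
    events = [(0.0, +1), (1.0, +1), (3.0, -1), (4.0, -1)]
    series = outstanding_series(events)
    print(series)
    print(time_average(series, end_time=4.0))
```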