Practicing the art of traffic shaping

Traffic shaping is one of those topics that most people find mystifying when they first research it. I’ve been doing traffic shaping work for years and I still find myself learning new things. Prompted by some recent work, I thought I’d write down a few notes on practical setups for Linux using the tc (traffic control) command. I also want to highlight some common issues people need to address when designing traffic control systems.

Problems with PFIFO_FAST

If there’s one thing that consistently surprises me it’s that the default Linux queuing discipline (PFIFO_FAST) actually works pretty well in a number of situations. It behaves like a PRIO queue with 3 bands. Each packet is assigned to one of the 3 bands based on its TOS field, and the bands are processed in strict priority order. If there are any packets in the first queue they will be sent before any in the second queue, and likewise packets in the second queue are always sent before those in the third. So here are the main issues with this approach:

  • High latency from starvation - Because of the way the PRIO queue works, if traffic is consistently sent to the first queue, packets in the lower queues will accumulate latency, be dropped, or stay queued indefinitely. A reliable way to observe this on a GNU/Linux system is to run rsync over ssh, which sends large amounts of data through the highest priority queue. It can also be used for a denial of service attack: an attacker who manages to send high priority traffic continuously will starve all other traffic. With the default shaper, complete starvation is somewhat mitigated by the fact that the local link is usually very fast. On a WAN connection the computer hands packets to the modem / router quickly enough that the queues drain, so starvation doesn’t happen as frequently.
  • High latency from packet queuing - Sending packets quickly to a modem or router causes those packets to be queued on that device instead. Those queues add latency as the modem / router struggles to send the data over the slower WAN link. To prevent this, a shaper has to know the bandwidth of the WAN link so that packets can be queued and prioritized on the GNU/Linux system.
  • No differentiation between LAN and WAN traffic - In the vast majority of cases LAN traffic can be processed faster than WAN traffic. An optimal shaper should differentiate this traffic to keep WAN traffic from slowing LAN traffic.
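The starvation problem in the first point can be illustrated with a toy model. This is only a sketch of strict-priority dequeueing, not kernel code: the function name simulate_prio, the one-packet-per-tick rule, and the backlog sizes are all assumptions made for the example.

```shell
#!/bin/sh
# Toy model of strict-priority dequeueing (an illustration, not kernel code).
# One packet is sent per tick; band 0 is always served before band 2.
# Usage: simulate_prio TICKS BAND0_ARRIVALS_PER_TICK BAND2_BACKLOG
simulate_prio() {
    ticks=$1; arrivals0=$2; q2=$3
    q0=0; sent0=0; sent2=0; t=0
    while [ "$t" -lt "$ticks" ]; do
        q0=$((q0 + arrivals0))          # new high priority packets arrive
        if [ "$q0" -gt 0 ]; then
            q0=$((q0 - 1)); sent0=$((sent0 + 1))
        elif [ "$q2" -gt 0 ]; then      # only reached when band 0 is empty
            q2=$((q2 - 1)); sent2=$((sent2 + 1))
        fi
        t=$((t + 1))
    done
    echo "band0_sent=$sent0 band2_sent=$sent2 band2_waiting=$q2"
}

simulate_prio 10 1 5   # band 0 saturated: band0_sent=10 band2_sent=0 band2_waiting=5
simulate_prio 10 0 5   # band 0 idle:      band0_sent=0 band2_sent=5 band2_waiting=0
```

As long as one high priority packet arrives per tick, the low priority backlog never shrinks, which is the rsync over ssh situation in miniature.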

Most common mistake

Before trying to improve on these problems it’s useful to know the most common mistake to avoid:

  • Do not throttle the bandwidth of a device to the WAN speed - This is worse for some users than others but in all cases it is suboptimal. Even if there is only 1 computer connected through broadband, this setting prevents the computer from talking to the modem / router at the full speed of the interface, which is often 1-2 orders of magnitude faster than the WAN speed. Plus, any local traffic counts against the WAN limit and so takes bandwidth away from real WAN traffic.

The root queuing discipline

I have found that the best root queue is a PRIO with 3 bands designated as follows:

  • PRIO
    1. Reserved for testing - almost never used
    2. WAN traffic, throttled to WAN speed
    3. LAN traffic, optionally throttled to interface speed

With a PRIO queue each band has to have another queuing discipline beneath it. Here are the options I tend to use:

  • PRIO
    1. Reserved - since this is rarely used PFIFO is fine
    2. WAN - to be discussed below but generally HFSC or HTB depending on needs
    3. LAN - SFQ is often fine but PFIFO, PFIFO_FAST, TBF, and possibly even RED are also worth considering
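A minimal sketch of this root layout with tc might look like the following. The interface name eth0, the LAN subnet 192.168.1.0/24, the handle numbers and the queue sizes are assumptions for illustration, not values from a tested setup. The script only prints the commands by default; set TC=tc and run it as root to apply them.

```shell
#!/bin/sh
# Dry run by default: the commands are only printed.
# Set TC=tc and run as root to apply them.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"                        # assumed interface name
LAN_NET="${LAN_NET:-192.168.1.0/24}"      # assumed LAN subnet

setup_root_prio() {
    # Root PRIO with 3 bands. priomap values are 0-based band indexes, so
    # "1" everywhere sends unclassified traffic to band 2 (class 1:2, WAN).
    $TC qdisc add dev "$DEV" root handle 1: prio bands 3 \
        priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

    # Band 1 (1:1): reserved for testing, a plain FIFO is enough.
    $TC qdisc add dev "$DEV" parent 1:1 handle 10: pfifo limit 100

    # Band 2 (1:2): WAN. Left on PRIO's default FIFO here; the WAN
    # shaper (HFSC or HTB) would be attached at this point.

    # Band 3 (1:3): LAN traffic, shared fairly between flows with SFQ.
    $TC qdisc add dev "$DEV" parent 1:3 handle 30: sfq perturb 10

    # Packets destined for the local subnet go to the LAN band.
    $TC filter add dev "$DEV" parent 1: protocol ip prio 1 u32 \
        match ip dst "$LAN_NET" flowid 1:3
}

setup_root_prio
```

Classifying by destination subnet is the simplest way to separate LAN from WAN traffic; iptables marks would work as well if the rules need to be more elaborate.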

Handling WAN traffic

When deciding how to handle WAN traffic in a shaping system it is important to decide whether it is necessary to have bandwidth reserved for certain types of traffic. That is, does the system require that 80kbps always be available for VOIP traffic and 250kbps always be available for serving web traffic or not.
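If reservations are needed, it helps to check early that they fit within the WAN link. The helper below is a hypothetical sketch (check_reservations is not a tc feature), and the 1000 kbit/s link rate is an assumed figure. It also assumes the rates above are in kilobits per second, which tc writes as kbit; in tc syntax kbps means kilobytes per second.

```shell
#!/bin/sh
# check_reservations LINK_KBIT RESERVATION_KBIT...
# Prints the unreserved remainder, or fails if the guarantees exceed the link.
check_reservations() {
    link=$1; shift
    total=0
    for r in "$@"; do
        total=$((total + r))
    done
    if [ "$total" -gt "$link" ]; then
        echo "over-committed: ${total}kbit reserved on a ${link}kbit link" >&2
        return 1
    fi
    echo "reserved=${total}kbit unreserved=$((link - total))kbit"
}

# 80kbit for VOIP and 250kbit for web on an assumed 1000kbit uplink.
check_reservations 1000 80 250   # reserved=330kbit unreserved=670kbit
```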

Reserving bandwidth and latency

If the answer is yes then the obscure HFSC queuing discipline should be chosen for the WAN traffic. HFSC is notable because it can guarantee latency as well as bandwidth, making it more suitable for applications, like VOIP, that are particularly sensitive to latency. However, if latency is not a concern at all, HTB could be used instead as it is far better documented.

The combination of PRIO + HFSC for WAN traffic is what I have deployed on a number of servers. It successfully addresses all of the problems of the PFIFO_FAST queuing discipline. The only downside to the approach is its complexity compared to PFIFO_FAST. Not only does the bandwidth need to be known, but all of the reservations of bandwidth and latency need to be determined and tested, which is no easy feat. It also requires much more complex iptables or tc filter rules to classify the traffic into the appropriate bands.

  • PRIO
    1. PFIFO - Reserved
    2. HFSC - WAN Traffic, bandwidth limited
      • Various channels which are usually SFQ or PFIFO
    3. SFQ - LAN Traffic
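The PRIO + HFSC combination could be sketched roughly as follows, with HFSC attached to the WAN band. Everything specific here is an assumption for illustration: the interface name, the 1000 kbit/s WAN rate, the handle numbers, the 20ms delay target, and the filters, which assume VOIP packets are already marked with TOS 0xb8 and that web traffic is served from port 80. The commands are only printed unless TC=tc is set.

```shell
#!/bin/sh
# Dry run by default: the commands are only printed.
# Set TC=tc and run as root to apply them.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"             # assumed interface name
WAN_KBIT="${WAN_KBIT:-1000}"   # assumed WAN uplink rate in kbit/s

setup_wan_hfsc() {
    # Root PRIO; unclassified traffic goes to band 2 (class 1:2, WAN).
    $TC qdisc add dev "$DEV" root handle 1: prio bands 3 \
        priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

    # HFSC under the WAN band; unmatched WAN traffic lands in class 20:30.
    $TC qdisc add dev "$DEV" parent 1:2 handle 20: hfsc default 30

    # Parent class: link-share and upper limit both set to the WAN rate.
    $TC class add dev "$DEV" parent 20: classid 20:1 hfsc \
        ls rate "${WAN_KBIT}kbit" ul rate "${WAN_KBIT}kbit"

    # VOIP: 80kbit guaranteed, with a 1500 byte burst served within 20ms.
    $TC class add dev "$DEV" parent 20:1 classid 20:10 hfsc \
        rt umax 1500b dmax 20ms rate 80kbit ls rate 80kbit
    # Web serving: 250kbit guaranteed.
    $TC class add dev "$DEV" parent 20:1 classid 20:20 hfsc sc rate 250kbit
    # Everything else shares whatever is left over.
    $TC class add dev "$DEV" parent 20:1 classid 20:30 hfsc \
        ls rate "$((WAN_KBIT - 330))kbit"

    # Leaf queues ("channels"): PFIFO for VOIP, SFQ for the rest.
    $TC qdisc add dev "$DEV" parent 20:10 handle 110: pfifo limit 20
    $TC qdisc add dev "$DEV" parent 20:20 handle 120: sfq perturb 10
    $TC qdisc add dev "$DEV" parent 20:30 handle 130: sfq perturb 10

    # Classification (assumed markings, adjust to the real traffic).
    $TC filter add dev "$DEV" parent 20: protocol ip prio 1 u32 \
        match ip tos 0xb8 0xff flowid 20:10
    $TC filter add dev "$DEV" parent 20: protocol ip prio 2 u32 \
        match ip sport 80 0xffff flowid 20:20
}

setup_wan_hfsc
```

The rt curve is what gives VOIP its bandwidth and delay guarantee, while the ls curves only decide how spare capacity is divided, so the reservations and the delay target are the numbers that need real testing.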

Relying on priorities

I was told that a number of people were using BSD for traffic shaping of VOIP applications. I have not verified whether this claim is accurate, but I decided to see how traffic shaping was being configured on BSD systems to handle VOIP applications. What I found was that the systems often rely on limiting the WAN rate and just putting the equivalent of a PRIO queue behind that. I decided to create an equivalent setup on a GNU/Linux system. Unfortunately there is not a simple rate limiter, so I chose to use HTB with a single class at the full WAN bandwidth and place a PRIO queue under that.

This is still an experimental setup that I have not tested thoroughly enough to recommend but the principle is sound. And I can verify that VOIP works very well, unless there is other traffic in the high priority band. This configuration solves the latter 2 issues with PFIFO_FAST but still allows for starvation and suffers from problems with rsync over ssh and similar traffic patterns. However, this may be reasonably addressed by altering the TOS field of packets to prevent such problems. I also wonder if HFSC might be a better choice than HTB or if PFIFO_FAST might be a better choice (for simplicity) than PRIO beneath HTB.

  • PRIO
    1. PFIFO - Reserved
    2. HTB - WAN Traffic, bandwidth limited
      • PRIO
        1. PFIFO
        2. SFQ
        3. SFQ
    3. SFQ - LAN Traffic
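A sketch of the HTB + PRIO variant in tc terms is below. As with the setup itself, treat it as experimental: the interface name, the 1000 kbit/s WAN rate and the handle numbers are assumptions, and the inner PRIO keeps its default priomap so that it sorts packets by TOS much as PFIFO_FAST does. The commands are only printed unless TC=tc is set.

```shell
#!/bin/sh
# Dry run by default: the commands are only printed.
# Set TC=tc and run as root to apply them.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"             # assumed interface name
WAN_KBIT="${WAN_KBIT:-1000}"   # assumed WAN uplink rate in kbit/s

setup_wan_htb_prio() {
    # Root PRIO; unclassified traffic goes to band 2 (class 1:2, WAN).
    $TC qdisc add dev "$DEV" root handle 1: prio bands 3 \
        priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

    # HTB under the WAN band, used purely as a rate limiter:
    # one class that gets the whole WAN bandwidth.
    $TC qdisc add dev "$DEV" parent 1:2 handle 20: htb default 1
    $TC class add dev "$DEV" parent 20: classid 20:1 htb \
        rate "${WAN_KBIT}kbit" ceil "${WAN_KBIT}kbit"

    # Inner PRIO with the default priomap, so bands follow the TOS field.
    $TC qdisc add dev "$DEV" parent 20:1 handle 21: prio bands 3

    # Leaf queues for the three inner bands.
    $TC qdisc add dev "$DEV" parent 21:1 handle 211: pfifo limit 20
    $TC qdisc add dev "$DEV" parent 21:2 handle 212: sfq perturb 10
    $TC qdisc add dev "$DEV" parent 21:3 handle 213: sfq perturb 10
}

setup_wan_htb_prio
```

Because the inner PRIO is still strict priority, anything that lands in its first band can starve the other two, which is the rsync over ssh caveat described above.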

LAN Considerations

I have often chosen SFQ for all LAN traffic, without any bandwidth limitation, but that is often because I’m dealing with small LANs where traffic prioritization is not an issue. If I were transmitting VOIP across a LAN then PFIFO_FAST might be a better choice. It also may be worthwhile to limit the bandwidth.

Except where otherwise noted, content on this site is licensed under a Creative Commons by-nc-sa 3.0 License.