Smart Queue is EdgeOS's traffic control solution, billed as knob-less QoS and built on fq_codel and HTB underneath. It can be applied to upload, download, or both. Knob-less is not exactly accurate: a few parameters are still exposed through the GUI for each direction:
- Rate - the WAN bandwidth
- ECN - yes/no
- FQ_CODEL Quantum
- HTB Quantum
As a starting point, I enabled both upload and download and filled in the bandwidth values for my 100 Mbit/s symmetric WAN. The other parameters were left at the EdgeOS defaults. Note that the default values only appear after you have applied the configuration. A minor defect.
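For the record, the same settings can also be applied from the EdgeOS CLI. The policy name below is arbitrary and the WAN interface is assumed to be eth0; adjust both for your setup:

```shell
configure
# Attach Smart Queue to the WAN interface (eth0 is an assumption)
set traffic-control smart-queue my-sq wan-interface eth0
# 100 Mbit/s symmetric line, matching the GUI Rate fields
set traffic-control smart-queue my-sq download rate 100mbit
set traffic-control smart-queue my-sq upload rate 100mbit
commit ; save
```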
Dslreports Buffer Bloat Test
People hold different opinions of the buffer bloat test at Dslreports. It is convenient, so I used it for my quick tests. My very first casual runs received a triple A+, which made me happy and cheerful about the ER-X (for a day or two).
Up to that point I did not understand what was happening behind the scenes. For example, I found out later that the Dslreports test page cannot properly detect geo-location when one is on dual stack. It determines location from the IP address loading its test page. My IPv6 address hitting its page was identified as US, which was correct because it was from my Hurricane Electric 6in4 tunnel. Worse still, I thought it then proceeded to test over my 6in4 tunnel. That was not true. It always tests over IPv4 until one explicitly switches to IPv6 through the preferences page.
So my very first runs were really a miracle getting A+ on buffer bloat. Or was it?
Looking closer at its scoring system, an A+ on buffer bloat means the average variation in lag during both download and upload stays within 5ms of the average idle lag, which in other words is the round-trip time between the test servers and my PC when the line is not busy. I got an average idle lag around 150ms during the test. Scoring A+ means Smart Queue managed to keep the average lag during download and upload within 145ms to 155ms.
That turns out to be not much of a challenge for Smart Queue.
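To make the scoring concrete, here is a small sketch of my reading of the grade bands described above (A+ within 5ms of idle, A within 6ms to 20ms). This is my own paraphrase, not Dslreports' actual algorithm:

```shell
#!/bin/sh
# Hypothetical grading sketch; bands inferred from the text above,
# not Dslreports' real scoring code.
grade() {
  idle=$1
  loaded=$2
  diff=$(( loaded - idle ))
  [ "$diff" -lt 0 ] && diff=$(( -diff ))
  if [ "$diff" -le 5 ]; then
    echo "A+"
  elif [ "$diff" -le 20 ]; then
    echo "A"
  else
    echo "B or worse"
  fi
}

grade 150 153   # 3ms above a 150ms idle lag -> A+
grade 20 40     # 20ms above a 20ms idle lag -> A
```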
To avoid the pitfalls in Dslreports, I registered an account and created a test profile with:
- one fixed test server nearest to me (900km away)
- IPv4, 12 upload streams and 12 download streams
- do not auto detect geo-location
- high-resolution buffer bloat timer
I ran this test profile a few times and consistently scored A for buffer bloat. Per the scoring system, that means the average lag differed from idle by between 6ms and 20ms. Let's take a look at the actual lags:
With an average idle lag of ~20ms, Smart Queue managed the upload very well in the test. Download was less desirable. Where did the extra 20ms come from?
Cut Throat Forwarding
My test PC connects to the ER-X through an Asus RT-AC56U acting as an access point. When I plugged the test PC directly into the ER-X, the download lag dropped to ~20ms, on par with the upload lag and idle lag, though jitter in the download lag was still higher than in the upload. With this extra lag eliminated, I consistently scored A+.
It turned out I had cut-through forwarding (CTF) switched on in the Asus AP. While CTF is supposed to speed up traffic, it seems to add extra delay under load. It was quite a revealing moment when I found this out.
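For anyone wanting to check their own Asus AP: CTF is usually toggled from the GUI under NAT Acceleration, but it can also be inspected from a shell. The nvram variable name below is my assumption from Asuswrt-based firmware and may differ across versions; verify before relying on it:

```shell
# See whether the CTF kernel module is currently loaded
lsmod | grep -i ctf

# On Asuswrt-based firmware this flag is believed to disable CTF
# (variable name is an assumption; confirm on your own firmware)
nvram set ctf_disable=1
nvram commit
reboot
```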
Dslreports has changed its graphing style since my first test, but we can see a drop in download lag with CTF off.
Smart Queue Tuning
- Limit = 1514
- HTB Quantum = 20000
These are the only two parameters I changed. HTB Quantum defaults to 200,000, which causes a warning to be logged to syslog. The consensus recommendation is 8,000 or above for any WAN faster than 100 Mbit/s, but nothing more precise than that. In the kernel, HTB quantum is capped at 60,000; EdgeOS, contrary to that, caps it at 65,535 in its GUI.
Limit defaults to ten times Flows, but I read elsewhere a recommendation to keep it the same as the FQ_CODEL Quantum.
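Underneath, Smart Queue is essentially an HTB rate limiter with fq_codel as the leaf qdisc. A rough hand-rolled equivalent of my tuned settings might look like the following. The interface name and the single-class HTB layout are my assumptions for illustration, not a dump of what EdgeOS actually programs:

```shell
# Shape egress on eth0 to 100 Mbit/s with HTB, fq_codel as the leaf
# (interface and class structure are assumptions, not EdgeOS's exact setup)
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb \
    rate 100mbit quantum 20000
tc qdisc add dev eth0 parent 1:10 fq_codel \
    limit 1514 quantum 1514 ecn
```

The fq_codel quantum stays at its 1514-byte default, which is why setting Limit to 1514 matches it.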
My test setup looks like this: PC <> RT-AC56U (AP) <> ER-X <> WAN. I have these TCP/IP parameters tuned on the RT-AC56U:
echo 50000 > /proc/sys/net/ipv4/tcp_max_tw_buckets
echo 1048576 > /proc/sys/net/core/wmem_max
echo 1048576 > /proc/sys/net/core/rmem_max
echo 1000 > /proc/sys/net/core/netdev_max_backlog
echo "4096 87380 2038400" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 87380 2038400" > /proc/sys/net/ipv4/tcp_wmem
echo 1024 > /proc/sys/net/ipv4/tcp_max_syn_backlog
Here is the part that still puzzles me. On a dumb access point, the LAN ports and WLAN interfaces should all work at layer 2, so CTF and TCP/IP parameters should have no effect. Right? My RT-AC56U is not that dumb, because I also want it to run a few daemons including an OpenVPN server, so I enable both IPv4 and IPv6 forwarding. Will this have an impact (through bugs or otherwise) on layer 2 traffic? Worth a second look in the future.
Test After Tuning
Note that network latency, or idle lag, between my cherry-picked test server and my PC varies at different times. What matters is the relative deviation from idle lag. From the above charts, we can see both download and upload average lags around 30ms, the same as the idle lag. Smart Queue worked great after the change! Dslreports handed me a quadruple A+:
Okay, I cheated on the speed grade. I told Dslreports my WAN speed was 85 Mbit/s and then asked it to re-grade. I'm being pragmatic, as the test server is far away.
With Smart Queue enabled and the WAN saturated, CPU utilisation on my ER-X was about 35% during the tests. Very nice indeed. I just learned from an EdgeOS developer something even sweeter for better efficiency.
Let's leave it for another day.