Revisiting Forwarding Speed in ER-X

In a previous post, I ran quick tests to show the maximum packet switching and forwarding performance of the EdgeRouter X. That gave a glimpse but left open questions in my mind, and I believe the same is true for sophisticated users.

I found EdgeOS surprisingly easy to reconfigure. I first thought it would be interesting to repurpose eth1 of my LAN-2 as a secondary WAN. On second thought, I decided to take eth2 out of switch0 and set it up as the secondary WAN instead. The overall configuration after the change provides two local networks, a primary WAN (which continues to serve the Internet in case I need access during testing), and one additional WAN for running benchmarks.

Test Equipment and Wiring

Two client PCs: one 2015 13" MacBook Pro and one 2011 27" iMac. The MBP does not come with a Gigabit Ethernet NIC, so I use a Unitek USB 3.0 to GbE adaptor to connect it to the ER-X. This adaptor has a Realtek chipset and performs really well. The iMac has built-in GbE (Broadcom NetXtreme chipset).

*My home LAN (partial) and Wiring of Test Equipment*

In other words, the illustration above shows the following:

  • eth0 is always an independent port and the primary WAN to my ISP
  • eth1 is also an independent port and the secondary LAN, LAN-2
  • eth2 is an independent port when configured as the secondary WAN, and a member of switch0 otherwise.
  • switch0 is my primary LAN, LAN-ukei, and always has eth3 and eth4 as members. It also has eth2 as a member when eth2 is not the secondary WAN.
  • eth4 connects to the WAN port on ASUS RT-AC56U (in AP mode).

Internet service was never interrupted during testing, so it might have had a minute impact on the test results. Realistically though, as I kept Internet activity to a minimum during testing, the impact is trivial if not negligible.

The iMac stays connected to eth3 throughout the tests. The MacBook Pro's wiring depends on the test scenario. Here is a summary. The MBP connects (through the USB 3.0 to GbE adaptor) to:

  • eth1 for routing benchmark across two networks,
  • eth2 for routing benchmark with NAT, firewall and QoS, where eth2 is configured as the secondary WAN, or
  • eth2 for switching benchmark, where eth2 is configured as a member of switch0

Configure ER-X

My primary LAN, LAN-ukei, has network 192.168.1.0/24. Secondary LAN, LAN-2, has network 192.168.10.0/24.

When eth2 is configured as the secondary WAN, it has network 192.168.20.0/24. In that case, eth2 itself is statically assigned 192.168.20.1. Its remote gateway, i.e. the MBP, is statically assigned 192.168.20.2.

To reconfigure eth2 as a member of switch0 or as an independent port for the secondary WAN, the very first step is to include the port in, or exclude it from, switch0. Go to Dashboard, click Actions next to switch0 and then select Config. Uncheck (to exclude) or check (to include) eth2 as a member of the switch:

*Uncheck to exclude eth2 from switch0 (left). Assign network 192.168.20.0/24 to eth2 (right)*

When unchecked (and remember to save it), eth2 becomes an independent port. Back on the Dashboard, click Actions next to eth2 and assign it the network 192.168.20.0/24.

Now we can proceed to set up NAT, firewall and QoS on eth2. The GUI provides a "copy" function which duplicates existing NAT or firewall rulesets (and rules) and applies them to a new WAN interface. Very convenient. For QoS, add a new Smart Queue and select eth2 as the WAN interface. All of these operations are straightforward and can be done through the GUI, so I'll skip the boring steps. To restore eth2 as a member of switch0, use the GUI to undo the above steps.

Quick tip: to delete a firewall ruleset, unselect the associated interface first and save, then you'll be able to delete the ruleset. Otherwise, EdgeOS will complain that the ruleset is in use and cannot be deleted.

To summarise the setup of NAT, firewall and QoS of the secondary WAN:

  • an exact copy of the primary WAN's config
  • one NAT rule that does source address translation aka masquerade in Linux
  • two firewall rulesets: one for forwarding and one for input to localhost. A total of 14 rules.
  • QoS only turned on for upload traffic. Uses fq_codel+HTB aka Smart Queue in EdgeOS.

There are two reasons for only upload QoS. First, throughput with and without QoS can be measured conveniently without config change. Second, it's my preference and mirrors my production setup.

Note that HWNAT, as it is colloquially called, is always enabled in my ER-X. Another name for HWNAT in the ER-X or the MediaTek SoC is IP offload (in the sense that packet processing is offloaded from the CPU). IP offload in the ER-X speeds up all routed traffic, regardless of whether it is NAT'ed or not.

EdgeOS v1.9.0 is used in the tests.

Tests and Results

I used iperf3 in TCP mode to benchmark throughput. The MBP runs as the iperf3 server. The iMac initiates send or receive depending on the test scenario. With large packets, iperf3 can saturate the bandwidth with a single stream. With smaller packets, multiple iperf3 streams (-P option) are usually required to achieve maximum throughput. I increased the number of streams until throughput stopped improving or the CPU was fully utilised. I'll explicitly point out how many streams were used if more than one was needed.
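For readers who want to reproduce the runs, here is a minimal sketch of how the client command line could be assembled. It assumes the MSS is set with iperf3's -M option and the direction is flipped with -R; the post does not say exactly which flags were used, so treat those as my assumptions. The helper name `iperf3_client_args`, the 10 second duration and the server address (the MBP's address in the secondary WAN scenario) are purely illustrative.

```python
from typing import List


def iperf3_client_args(server: str, mss: int, streams: int = 1,
                       reverse: bool = False, seconds: int = 10) -> List[str]:
    """Build an iperf3 client command line for one TCP test run.

    Hypothetical helper: -M (set MSS), -P (parallel streams), -R (reverse
    direction) and -t (duration) are standard iperf3 options, but whether
    the original tests used exactly these is an assumption.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    if streams < 1:
        raise ValueError("streams must be at least 1")
    args = ["iperf3", "-c", server, "-M", str(mss), "-t", str(seconds)]
    if streams > 1:
        args += ["-P", str(streams)]  # multiple parallel TCP streams
    if reverse:
        args.append("-R")  # server sends, client receives
    return args


if __name__ == "__main__":
    # Six streams of 64-byte segments towards the iperf3 server on the MBP.
    print(" ".join(iperf3_client_args("192.168.20.2", 64, streams=6)))
```

The resulting list can be passed to `subprocess.run` on the iMac once `iperf3 -s` is running on the MBP. Note that the operating system may clamp very small MSS values, so it is worth checking the MSS that iperf3 reports.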

The same benchmark was repeated at three different packet sizes: 1460 bytes, 512 bytes and 64 bytes. Vendors usually quote similar packet sizes in their router performance sheets.

Note that in my tests, the packet sizes are actually the Maximum Segment Size (MSS). To get the Maximum Transmission Unit (MTU), add 40 bytes for the IP and TCP headers (20 bytes each, assuming IPv4 and no TCP options). Also note that Mbit per second is based on MSS payloads as reported by iperf3. So if you prefer bandwidth in MTU, use this conversion:

  • Mbit per second in MTU = (packet size in MSS + 40) x 8 x packets per second / 1,000,000
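The conversion can be sketched in a few lines of Python. The function names are mine, and the 100,000 packets per second figure is only an illustrative input, not a measurement from these tests.

```python
TCP_IP_HEADER_BYTES = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def payload_mbps(mss_bytes: int, packets_per_second: float) -> float:
    """Throughput counted on MSS payload only, which is what iperf3 reports."""
    return mss_bytes * 8 * packets_per_second / 1_000_000


def mtu_mbps(mss_bytes: int, packets_per_second: float) -> float:
    """Throughput counted on whole IP packets (MSS plus 40 bytes of headers)."""
    return (mss_bytes + TCP_IP_HEADER_BYTES) * 8 * packets_per_second / 1_000_000


if __name__ == "__main__":
    pps = 100_000  # illustrative packet rate, not a measured value
    for mss in (1460, 512, 64):
        print(f"MSS {mss:>4}: payload {payload_mbps(mss, pps):7.1f} Mbit/s, "
              f"MTU-based {mtu_mbps(mss, pps):7.1f} Mbit/s")
```

The gap between the two numbers grows as packets shrink: at a 64-byte MSS the headers add more than 60% on top of the payload, while at 1460 bytes they add under 3%.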

Switch Performance

*Switching Performance*

Throughput of large (1460 bytes) packets is wire speed. For medium (512 bytes) packets, ER-X also demonstrates very good throughput. I'm not sure if it's exceptionally better than consumer routers. People usually don't test all-in-ones at such granularity, and their throughput numbers generally apply to large packets only.
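To put "wire speed" in numbers, here is a rough sketch of the theoretical payload ceiling on gigabit Ethernet for each MSS. It assumes IPv4, no TCP options, no VLAN tag, and the standard 38 bytes of per-frame Ethernet overhead (preamble and start delimiter 8, header 14, FCS 4, inter-frame gap 12). Real runs will land somewhat lower, for instance when TCP timestamps are enabled.

```python
LINK_MBPS = 1000             # gigabit Ethernet line rate
TCP_IP_HEADER_BYTES = 40     # IPv4 + TCP headers, assuming no options
ETHERNET_OVERHEAD_BYTES = 38 # preamble/SFD 8 + header 14 + FCS 4 + gap 12


def max_payload_mbps(mss_bytes: int) -> float:
    """Upper bound on MSS-payload throughput for back-to-back full frames."""
    on_wire = mss_bytes + TCP_IP_HEADER_BYTES + ETHERNET_OVERHEAD_BYTES
    return LINK_MBPS * mss_bytes / on_wire


def max_packets_per_second(mss_bytes: int) -> float:
    """Upper bound on packet rate the link itself can carry at this MSS."""
    on_wire = mss_bytes + TCP_IP_HEADER_BYTES + ETHERNET_OVERHEAD_BYTES
    return LINK_MBPS * 1_000_000 / (on_wire * 8)


if __name__ == "__main__":
    for mss in (1460, 512, 64):
        print(f"MSS {mss:>4}: at most {max_payload_mbps(mss):6.1f} Mbit/s payload, "
              f"{max_packets_per_second(mss) / 1000:6.1f} Kpps")
```

Under these assumptions the ceiling is roughly 949 Mbit/s at 1460 bytes and roughly 868 Mbit/s at 512 bytes. At small packet sizes the link could carry far more packets than any router in this class can process, so the limit there is packet rate rather than bandwidth.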

At small (64 bytes) packets, I used six iperf3 streams to get maximum throughput. A single stream hits 150-ish Mbit/s. The difference isn't categorical. Here we can see the MediaTek SoC's peak packet processing performance in ER-X. That peak from my tests is around 320 Kpps.

CPU utilisation is 0% in all three cases.

Forwarding (without NAT, FW or QoS)

*Forwarding without NAT, FW and QoS*

Impressive performance here. For all packet sizes, throughput matches the wire speed of the switch. Note that packet forwarding requires going through the Linux kernel and the HWNAT ASIC. The numbers show that HWNAT in the MediaTek SoC performs very well and is seamlessly glued to the kernel's TCP/IP stack.

For small packets, I used six iperf3 streams but again no categorical difference from a single stream. At 308 Kpps, it's 14 Kpps short of the switch performance. I'm not sure if it's a statistical error. I only realised it when doing data analysis a few days after test data was captured.

Forwarding (with NAT, FW and QoS)

*Forwarding with NAT, FW and QoS*

Download

For all downloads, impressive numbers. Note that I don't have QoS applied to download traffic. Throughput is wire speed of the switch. Again packet forwarding goes through kernel stack as well as HWNAT.

At small packets, I used six iperf3 streams to get maximum throughput. A single stream produced about 70 Mbit/s. Here we can see a categorical difference with multiple streams. It's worth further analysis on another day. My gut feeling is that the difference is due to QoS applied on upload. Bear in mind that even though the bulk of the traffic is on download, the number of ACK packets in the upload direction is non-trivial here.

Also worth calling out is the higher CPU utilisation. HWNAT should impose little to negligible CPU use, but the ACK packets in the upload direction do cost CPU. The CPU utilisation should be mainly attributable to QoS in the upload direction.

Upload and QoS Performance

With QoS applied on upload, I expected throughput to take a hit. ER-X still surprised me here. The 880 MHz dual-core four-thread MIPS CPU inside the MediaTek SoC can push between 38 and 46 Kpps with fq_codel+HTB QoS.

At large packet size, throughput is 541 Mbit/s, slightly more than half of the wire speed. That's very good. Even at medium packet size, 183 Mbit/s is more than comfortably usable for many applications and situations. At small packet size, ER-X takes a serious hit on throughput. To put it in perspective though, the average packet size of typical traffic should be between 512 and 1500 bytes.
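The packet rates behind those upload numbers can be recovered from the throughput figures. This is a small sketch with a function name of my own choosing; it only rearranges the payload arithmetic (Mbit/s = MSS x 8 x packets per second / 1,000,000) and uses the two throughputs quoted above as inputs.

```python
def packets_per_second(mss_bytes: int, payload_mbps: float) -> float:
    """Packet rate implied by an iperf3 payload throughput at a given MSS."""
    return payload_mbps * 1_000_000 / (mss_bytes * 8)


if __name__ == "__main__":
    # Upload throughput with Smart Queue enabled, as quoted in the text.
    for mss, mbps in ((1460, 541), (512, 183)):
        kpps = packets_per_second(mss, mbps) / 1000
        print(f"MSS {mss:>4} at {mbps} Mbit/s is about {kpps:.1f} Kpps")
```

Both results come out at roughly 45 to 46 Kpps, which suggests that with QoS enabled the router is bound by packets per second rather than by bits per second.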

The number of streams needed to get maximum throughput in the three upload tests is summarised below. The table also shows the QoS performance of ER-X from a different perspective.

*Smart Queue QoS Performance*

I can see the pattern here. QoS in Linux is a serialised process. Little parallelism in packet processing can be exploited on a single connection. But at application level with multiple concurrent connections, the potential of ER-X can be used to a fuller extent (e.g., browsing and BitTorrent, multiple users accessing Internet at the same time).

Conclusion

Based on my tests, ER-X is capable of gigabit WAN as long as I use HWNAT (there is no reason not to, in my opinion). Expect usable throughput seen by applications to be somewhere between 825 Mbit/s and 925 Mbit/s. Once QoS is applied, throughput drops to about half and could be as low as 1/50 in the extreme case (64 byte packets). To be fair, network traffic at SOHO/home rarely consists of tiny packets for a prolonged period.

I also have my doubts about the needs of SOHO/home users. Do they have real applications that can sustain even 10% of gigabit WAN at any given time? Nor do I believe that people in such an environment will need QoS at that speed. In practice, users benefit from the lower latency and faster response rather than from full use of the bandwidth. For casual use, ER-X should perform satisfactorily on gigabit WAN.

For an SME environment (say >20 users) or SOHO users with serious applications that utilize gigabit bandwidth, I would invest in a more capable router. Not that ER-X will fail at 30 users. I wager ER-X may fare well with 100 users. The other day I heard some people deployed Asus routers (WiFi disabled) in a medium size business with 100 users. There is no way ER-X, with the newer, leaner and meaner EdgeOS, will fail there; if anything, it should do a more satisfactory job.

TL;DR ER-X is not for 1000/1000 Mbit/s WAN. For casual use, performance should be satisfactory. At 500/500 Mbit/s WAN or below, ER-X surely has no problem and excels.
