[LEDE-DEV] A state of network acceleration

Rafał Miłecki zajec5 at gmail.com
Wed Jan 17 07:25:10 PST 2018

Getting better network performance (mostly for NAT) using some kind of
acceleration has always been a hot topic and people are still
looking/asking for it. I'd like to write a short summary and share my
understanding of the current state so that:
1) People can understand it better
2) We can have some rough plan

First of all, there are two possible ways of accelerating network
traffic: in software and in hardware. A software solution is independent
of architecture/device and mostly just bypasses the in-kernel packet
flow. It still uses the device's CPU, which can be a bottleneck. Various
software implementations are reported to be 2x to 5x faster.
Hardware acceleration requires a hw-specific implementation and can
offload the device's CPU.

Of course, handling network traffic outside of the networking subsystem
means some features like QoS / throughput limits / advanced firewall
rules may not/won't work.
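
To make the software bypass idea concrete, here is a toy sketch in
Python (not kernel code, and not taken from any of the existing
implementations). The first packet of a connection takes the full slow
path, the resulting decision is cached under the connection's 5-tuple,
and later packets reuse it without evaluating any rules again. That
skipped evaluation is also why per-packet features such as QoS stop
applying. All names (slow_path, FastPath) and addresses are made up for
illustration.

```python
from collections import namedtuple

# 5-tuple identifying a connection; all values used below are illustrative.
Flow = namedtuple("Flow", "proto src sport dst dport")


def slow_path(flow):
    """Stand-in for the full in-kernel walk (firewall rules, NAT, routing).

    Hypothetical policy: masquerade everything and send it out via "wan".
    """
    return {"snat_to": "203.0.113.1", "out_dev": "wan"}


class FastPath:
    def __init__(self):
        self.cache = {}      # Flow -> cached forwarding decision
        self.slow_hits = 0   # packets that went through slow_path()
        self.fast_hits = 0   # packets served from the cache

    def forward(self, flow):
        action = self.cache.get(flow)
        if action is None:
            # First packet of the flow: take the slow path, remember the result.
            action = slow_path(flow)
            self.cache[flow] = action
            self.slow_hits += 1
        else:
            # Later packets: no rule evaluation at all.
            self.fast_hits += 1
        return action


fp = FastPath()
flow = Flow("tcp", "192.168.1.10", 40000, "198.51.100.7", 443)
for _ in range(1000):
    fp.forward(flow)
print(fp.slow_hits, fp.fast_hits)
```

Only one of the 1000 packets pays the cost of the slow path here; a
real implementation additionally has to decide when a cached entry must
be dropped.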

The hardest task (for both methods) has always been Linux kernel
integration. Drivers had to somehow:
1) Get/build a table with rules for packet flow
2) Update in-kernel state to e.g. avoid connection timeout & its removal
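
Point 2) can be illustrated with another toy model in Python. It is
only a sketch under simplifying assumptions: one connection, integer
time, and hypothetical class names (ConntrackEntry, OffloadedFlow) that
do not correspond to any kernel structures. Packets handled by the
accelerator never reach the connection tracking code, so unless the
driver reports activity back, the kernel believes the connection is
idle and removes it.

```python
class ConntrackEntry:
    """Toy model of a tracked connection with an idle timeout."""

    def __init__(self, now, timeout):
        self.timeout = timeout
        self.expires = now + timeout

    def refresh(self, now):
        self.expires = now + self.timeout

    def expired(self, now):
        return now >= self.expires


class OffloadedFlow:
    """Toy model of the accelerator's private per-flow state."""

    def __init__(self, ct):
        self.ct = ct
        self.last_seen = None

    def fast_packet(self, now):
        # Handled outside the networking stack: conntrack is NOT touched.
        self.last_seen = now

    def sync(self):
        # Report the last observed activity back to the kernel state.
        if self.last_seen is not None:
            self.ct.refresh(self.last_seen)


def simulate(sync_enabled, timeout=30, duration=100, step=10):
    """Return True if the connection survives steady offloaded traffic."""
    ct = ConntrackEntry(now=0, timeout=timeout)
    flow = OffloadedFlow(ct)
    for now in range(0, duration + 1, step):
        if ct.expired(now):
            return False  # entry removed while traffic is still flowing
        flow.fast_packet(now)
        if sync_enabled:
            flow.sync()
    return True


print(simulate(sync_enabled=False), simulate(sync_enabled=True))
```

Without the sync step the entry expires after the first timeout period
even though packets keep flowing; with it the entry stays alive.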

The problem with all existing implementations was that they used various
non-upstream patches for kernel integration. Some were less invasive,
some a bit more. They weren't properly reviewed by kernel developers
and usually used hacks/solutions that couldn't be accepted.

The rescue came with Pablo's work on offloading infrastructure. He
worked hard on this, developing & sending his patchset upstream:
[1] [PATCH RFC,WIP 0/5] Flow offload infrastructure
[2] [PATCH nf-next RFC,v2 0/6] Flow offload infrastructure
[3] [PATCH nf-next,v3 0/7] Flow offload infrastructure

The best news is that his final patchset version was accepted and now
sits in net-next [4] (and should become part of kernel 4.16).

Now, what does it mean for the LEDE project:
1) There is upstream infrastructure that should be ready to use
2) It's based on & requires nftables
3) LEDE's firewall3 uses iptables (& friends) C API
4) There aren't any drivers for offloading hardware (switches?) yet

One thing I'm not sure about is whether the software accelerator is
ready or not. Pablo in his e-mail wrote:
> So far, this is a generic software flow table representation, that
> matches basic flow table hardware semantics but that also provides a
> software faster path. So you can use it to purely forward packets
> between two nics even if they come with no hardware offload support.

which could suggest the software path is already there.

So here is my idea of what is needed by LEDE to get it working:
1) Rewrite firewall3 to use nftables
2) Switch to kernel 4.16 or backport offloading to 4.14
3) Work on implementing/enabling software acceleration path

Let me know if the above description makes sense to you or correct me if
you think I misunderstood something :)

[1] https://www.spinics.net/lists/netfilter-devel/msg50141.html
[2] https://www.spinics.net/lists/netfilter-devel/msg50555.html
[3] https://www.spinics.net/lists/netfilter-devel/msg50759.html
[4] https://www.spinics.net/lists/netfilter-devel/msg50973.html
