[RFC PATCH net] netfilter: flowtable: fix offloaded ct timeout never being extended
Florian Westphal
fw at strlen.de
Wed May 27 00:34:49 PDT 2026
Adrian Bente <adibente at gmail.com> wrote:
[ trimming CCs .. ]
> OpenWrt has recently migrated many platforms to kernel 6.18. On the
> MediaTek platform, which supports hardware network offloading, WiFi
> connections accelerated via the WED path were observed to drop after
> roughly 300 seconds.
>
> After several debugging sessions, assisted by the Claude LLM, the
> problem was narrowed down as follows:
>
> nf_flow_table_extend_ct_timeout() extends ct->timeout for offloaded
> flows using:
>
> cmpxchg(&ct->timeout, expires, new_timeout);
>
> 'expires' comes from nf_ct_expires(ct) and is a relative value, while
> ct->timeout holds an absolute timestamp. The two are never equal, so
> the cmpxchg always fails and the timeout is never extended.
>
> This goes unnoticed for most flows, but a long-lived hardware (WED)
> offloaded flow on MediaTek MT7986 eventually has ct->timeout decay to
> zero, the conntrack entry is reaped and the connection breaks.
>
> Compare against the current ct->timeout value instead.
>
> This patch is sent as RFC: the diagnosis is verified on hardware and
> the fix resolves the drop, but review of the chosen approach is
> welcome.
I guess we need to open-code the 'expires' computation so that the raw
(absolute) ct->timeout value is available for the cmpxchg, something
like this (not even compile tested).

Also see https://sashiko.dev/#/patchset/20260526060138.3924-1-adibente%40gmail.com
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -506,7 +506,12 @@ static u32 nf_flow_table_tcp_timeout(const struct nf_conn *ct)
 static void nf_flow_table_extend_ct_timeout(struct nf_conn *ct)
 {
 	static const u32 min_timeout = 5 * 60 * HZ;
-	u32 expires = nf_ct_expires(ct);
+	u32 ct_timeout = READ_ONCE(ct->timeout);
+	s32 expires;
+
+	expires = ct_timeout - nfct_time_stamp;
+	if (expires <= 0) /* already expired */
+		return;
 
 	/* normal case: large enough timeout, nothing to do. */
 	if (likely(expires >= min_timeout))
@@ -524,7 +529,7 @@ static void nf_flow_table_extend_ct_timeout(struct nf_conn *ct)
 	if (nf_ct_is_confirmed(ct) &&
 	    test_bit(IPS_OFFLOAD_BIT, &ct->status)) {
 		u8 l4proto = nf_ct_protonum(ct);
-		u32 new_timeout = true;
+		u32 new_timeout = 1;
 
 		switch (l4proto) {
 		case IPPROTO_UDP:
@@ -549,7 +554,7 @@ static void nf_flow_table_extend_ct_timeout(struct nf_conn *ct)
 		 */
 		if (new_timeout) {
 			new_timeout += nfct_time_stamp;
-			cmpxchg(&ct->timeout, expires, new_timeout);
+			cmpxchg(&ct->timeout, ct_timeout, new_timeout);
 		}
 	}
 
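
For illustration only (not part of the patch): a minimal userspace sketch
of why the old compare value can never match. It assumes C11 atomics as a
stand-in for cmpxchg(); fake_now, fake_timeout, remaining(), extend_buggy(),
extend_fixed() and demo() are hypothetical names made up for this sketch,
they are not kernel symbols.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not kernel API. */
uint32_t fake_now;              /* plays the role of nfct_time_stamp */
_Atomic uint32_t fake_timeout;  /* plays the role of ct->timeout (absolute) */

/* Remaining lifetime, relative to "now" and clamped at 0, which is
 * what the quoted report says nf_ct_expires() returns.
 */
uint32_t remaining(void)
{
	int32_t left = (int32_t)(atomic_load(&fake_timeout) - fake_now);

	return left > 0 ? (uint32_t)left : 0;
}

/* Old behaviour: the compare value is relative, the stored value is
 * absolute, so the exchange fails unless the two happen to coincide.
 */
bool extend_buggy(uint32_t add)
{
	uint32_t expected = remaining();

	return atomic_compare_exchange_strong(&fake_timeout, &expected,
					      fake_now + add);
}

/* Proposed behaviour: read the absolute value once, derive the signed
 * remaining time from it, and use that same absolute value as the
 * compare operand.
 */
bool extend_fixed(uint32_t add)
{
	uint32_t expected = atomic_load(&fake_timeout);
	int32_t left = (int32_t)(expected - fake_now);

	if (left <= 0) /* already expired */
		return false;

	return atomic_compare_exchange_strong(&fake_timeout, &expected,
					      fake_now + add);
}

/* Walk through both variants with 100 ticks left on the entry. */
int demo(void)
{
	fake_now = 1000000;
	atomic_store(&fake_timeout, fake_now + 100);

	/* compares 100 against 1000100: no match, nothing stored */
	assert(!extend_buggy(300));
	assert(atomic_load(&fake_timeout) == 1000100);

	/* compares 1000100 against 1000100: match, timeout extended */
	assert(extend_fixed(300));
	assert(atomic_load(&fake_timeout) == 1000300);
	return 0;
}
```

Calling demo() from a small main() exercises both paths. In this sketch the
buggy variant only succeeds when fake_now is 0, i.e. when relative and
absolute values coincide.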