[PATCH] net: stmmac: prevent kernel panic during XDP program and XSK pool transitions

Jakub Kicinski kuba at kernel.org
Tue Jun 9 17:38:04 PDT 2026


On Fri, 05 Jun 2026 00:56:35 -0700 Carlos Fangmeier wrote:
> stmmac_xdp_set_prog() tears down and rebuilds all DMA channels via
> stmmac_xdp_release()/stmmac_xdp_open() without pausing the netdev
> TX path. Similarly, stmmac_xdp_enable_pool() and
> stmmac_xdp_disable_pool() reconfigure individual queue DMA rings
> while TX remains active.

This still looks racy. Please look at other drivers.

> If the kernel transmits a frame during these windows — for example an
> MLD report queued by the IPv6 stack — stmmac_xmit() calls
> dwmac4_set_addr() against an MMIO register whose mapping has been
> torn down, triggering a level-3 translation fault:
> 
>   Unable to handle kernel paging request at virtual address ffff8000840ec000
>   pc : dwmac4_set_addr+0x8/0x18
>   lr : stmmac_xmit+0x64c/0xb60
>   Call trace:
>    dwmac4_set_addr+0x8/0x18
>    dev_hard_start_xmit+0xb0/0x220
>    sch_direct_xmit+0x108/0x3f0
>    __dev_queue_xmit+0x844/0xd00
>    ip6_finish_output2+0x2d8/0x610
>    mld_sendpack+0x180/0x2e0
>    mld_ifc_work+0x1dc/0x480
> 
> The existing netif_tx_disable() in stmmac_xdp_release() is not
> sufficient because stmmac_xdp_open() re-enables TX via
> netif_tx_start_all_queues() before the caller regains control, leaving
> a window where the freshly rebuilt rings can race with pending TX work.
> 
> Fix this by wrapping each reconfiguration path with
> netif_tx_disable()/netif_tx_wake_all_queues():
> 
>  - stmmac_xdp_set_prog(): hold TX disabled across the full
>    stmmac_xdp_release() + stmmac_xdp_open() sequence, only waking
>    TX after stmmac_xdp_open() returns.
> 
>  - stmmac_xdp_enable_pool(): disable TX before tearing down the
>    queue, re-enable after the queue is rebuilt and NAPI is active.
> 
>  - stmmac_xdp_disable_pool(): same pattern around the pool teardown
>    and queue rebuild.
> 
> Tested on Cortex-A55 (stmmac/dwmac4, kernel 6.6.60) with AF_XDP

That's an ancient kernel; you'll have to test on an upstream kernel
for us to merge the patch.
-- 
pw-bot: cr
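
For readers following the thread, the ordering proposed in the quoted
commit message (disable TX, tear down, rebuild, then wake TX) can be
sketched with a small userspace mock. Everything below is hypothetical:
struct mock_dev and the mock_*() helpers are illustrative stand-ins, not
the stmmac structures or the kernel netif_tx_*() API. The mock is
single-threaded, so it only shows the intended ordering and cannot
demonstrate that the approach is free of races, which is the concern
raised in the review above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a netdev with a single TX ring. */
struct mock_dev {
	bool tx_enabled;   /* models the "queue started/stopped" state */
	int *ring;         /* models the ring mapping; NULL while torn down */
	int ring_storage;  /* backing storage for the rebuilt ring */
	int sent;          /* frames written to the ring */
	int deferred;      /* transmit attempts refused while TX was stopped */
};

/* Models stopping and waking the TX queues. */
static void mock_tx_disable(struct mock_dev *d) { d->tx_enabled = false; }
static void mock_tx_wake(struct mock_dev *d)    { d->tx_enabled = true; }

/* Models tearing down and rebuilding the ring. */
static void mock_release(struct mock_dev *d) { d->ring = NULL; }
static void mock_open(struct mock_dev *d)
{
	d->ring_storage = 0;
	d->ring = &d->ring_storage;
}

/* Models the stack handing a frame to the driver.
 * Returns 0 on success, -1 if the queue is stopped. */
static int mock_xmit(struct mock_dev *d, int frame)
{
	if (!d->tx_enabled) {
		d->deferred++;
		return -1;
	}
	assert(d->ring != NULL); /* would be the faulting access in the report */
	*d->ring = frame;
	d->sent++;
	return 0;
}

/* Test hook: a transmit attempt that arrives mid-reconfiguration. */
static void mock_mid_window_xmit(struct mock_dev *d)
{
	(void)mock_xmit(d, 42);
}

/* Models the proposed reconfiguration order: TX stays disabled across
 * the whole release + open sequence and is only woken afterwards. */
static void mock_set_prog(struct mock_dev *d, void (*mid)(struct mock_dev *))
{
	mock_tx_disable(d);
	mock_release(d);
	if (mid)
		mid(d);      /* simulated traffic while the ring is gone */
	mock_open(d);
	mock_tx_wake(d);
}
```

Calling mock_set_prog(&d, mock_mid_window_xmit) on an opened, woken
device leaves d.deferred at 1 and never touches the NULL ring. A real
driver additionally has to cope with transmit paths already running on
other CPUs, which this sketch does not model.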


