[PATCH net-next v3 00/13] net: lan966x: add support for PCIe FDMA
Daniel Machon
daniel.machon at microchip.com
Mon May 4 07:23:13 PDT 2026
When lan966x operates as a PCIe endpoint, the driver currently uses
register-based I/O for frame injection and extraction. This approach is
functional but slow, topping out at around 33 Mbps on an Intel x86 host
with a lan966x PCIe card.
This series adds FDMA (Frame DMA) support for the PCIe path. When
operating as a PCIe endpoint, the internal FDMA engine on lan966x cannot
directly access host memory, so DMA buffers are allocated as contiguous
coherent memory and mapped through the PCIe Address Translation Unit
(ATU). The ATU provides outbound windows that translate internal FDMA
addresses to PCIe bus addresses, allowing the FDMA engine to read and
write host memory. Because the ATU requires contiguous address regions,
page_pool and normal per-page DMA mappings cannot be used. Instead,
frames are transferred using memcpy between the ATU-mapped buffers and
the network stack. With this, throughput increases from ~33 Mbps to
~620 Mbps for default MTU.
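For illustration only, here is a minimal userspace sketch of the translation an outbound ATU window performs. The struct and function names are hypothetical and are not taken from the series; the point is that one window covers a single linear range, which is why scattered per-page mappings do not fit this model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one outbound ATU window (not the driver's struct). */
struct atu_window {
	uint64_t base;   /* window start in the device-internal address space */
	uint64_t size;   /* window length in bytes */
	uint64_t target; /* PCIe bus address that the window start maps to */
};

/*
 * Translate a device-internal address range to a PCIe bus address.
 * Returns false if [addr, addr + len) is not fully inside the window.
 */
static bool atu_translate(const struct atu_window *w, uint64_t addr,
			  uint64_t len, uint64_t *bus)
{
	if (len == 0 || len > w->size)
		return false;
	if (addr < w->base || addr - w->base > w->size - len)
		return false;

	/* Linear offset: the same distance from both window starts. */
	*bus = w->target + (addr - w->base);
	return true;
}
```

An access that crosses the end of the window is rejected rather than wrapped, so a buffer has to sit entirely inside one contiguous region.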
Patch 1 adds the shared drivers/net/ethernet/microchip/fdma/ directory
to the Sparx5 SoC MAINTAINERS entry.
Patches 2-3 prepare the shared FDMA library: patch 2 renames the
contiguous dataptr helpers for clarity, and patch 3 adds PCIe ATU
region management and coherent DMA allocation with ATU mapping.
Patches 4-6 refactor the lan966x FDMA code to support both platform
and PCIe paths: extracting the LLP register write into a helper,
exporting shared functions, and introducing an ops dispatch table
selected at probe time.
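As a rough sketch of what a probe-time ops selection can look like, assuming a backend is chosen by walking the parent chain for a PCI ancestor: every type and function name below is hypothetical and simplified for userspace, and the starting point of the walk (the device's parent) is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with a parent pointer. */
struct dev_node {
	const struct dev_node *parent;
	bool is_pci;
};

/* Hypothetical ops table; the real one carries the FDMA callbacks. */
struct fdma_ops {
	const char *name;
	int (*init)(void);
};

static int platform_init(void) { return 0; }
static int pci_init(void) { return 0; }

static const struct fdma_ops platform_ops = { "platform", platform_init };
static const struct fdma_ops pci_ops = { "pci", pci_init };

/* Pick the PCIe backend if any ancestor is a PCI device. */
static const struct fdma_ops *select_ops(const struct dev_node *dev)
{
	const struct dev_node *p;

	for (p = dev->parent; p; p = p->parent)
		if (p->is_pci)
			return &pci_ops;

	return &platform_ops;
}
```

Walking the whole chain avoids hard-coding how many levels sit between the switch device and the PCI function.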
Patches 7-8 harden the existing FDMA path for the PCIe endpoint
lifecycle: patch 7 clears latched FDMA error/interrupt stickies after
the switch reset so they don't assert as soon as interrupts are
enabled, and patch 8 adds a shutdown() callback that quiesces the
FDMA engine on host warm reboot (on the PCIe card the FDMA survives
host reset and would otherwise keep the shared INTx asserted into
the next probe).
Patch 9 adds the core PCIe FDMA implementation with RX/TX using
contiguous ATU-mapped buffers. Patches 10 and 11 extend it with MTU
change and XDP support respectively. XDP_PASS, XDP_TX, XDP_DROP and
XDP_ABORTED are supported; XDP_REDIRECT is deliberately not, because
the PCIe data path does not use page_pool.
Patches 12-13 update the lan966x PCI device tree overlay to extend the
cpu register mapping to cover the ATU register space and add the FDMA
interrupt.
To: Andrew Lunn <andrew+netdev at lunn.ch>
To: David S. Miller <davem at davemloft.net>
To: Eric Dumazet <edumazet at google.com>
To: Jakub Kicinski <kuba at kernel.org>
To: Paolo Abeni <pabeni at redhat.com>
To: Horatiu Vultur <horatiu.vultur at microchip.com>
To: Steen Hegelund <steen.hegelund at microchip.com>
To: UNGLinuxDriver at microchip.com
To: Alexei Starovoitov <ast at kernel.org>
To: Daniel Borkmann <daniel at iogearbox.net>
To: Jesper Dangaard Brouer <hawk at kernel.org>
To: John Fastabend <john.fastabend at gmail.com>
To: Stanislav Fomichev <sdf at fomichev.me>
To: Herve Codina <herve.codina at bootlin.com>
To: Arnd Bergmann <arnd at arndb.de>
To: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
To: Mohsin Bashir <mohsin.bashr at gmail.com>
Cc: netdev at vger.kernel.org
Cc: linux-kernel at vger.kernel.org
Cc: bpf at vger.kernel.org
Cc: linux-arm-kernel at lists.infradead.org
Signed-off-by: Daniel Machon <daniel.machon at microchip.com>
---
Changes in v3:
Version 3 fixes a number of issues reported by Sashiko - mostly
hardening.
- Fix double use of XDP_PACKET_HEADROOM.
- Fix ERR_PTR persistence in fdma->atu_region and add missing
NULL/ERR_PTR guard in fdma_pci_atu_region_unmap().
- Reject size <= 0 in fdma_pci_atu_region_map() and return
-ENOSPC (was -ENOMEM) when no region is free.
- Introduce lan966x_fdma_pci_tx_size_fits() that accounts for
XDP_PACKET_HEADROOM; use it from both xmit paths to keep
bpf_xdp_adjust_tail from writing past the TX slot.
- Validate BLOCKL in rx_check_frame() (reject < IFH+FCS or
> db_size) before it feeds memcpy/XDP sizes.
- READ_ONCE(port->xdp_prog) inside lan966x_xdp_pci_run() to close
a TOCTOU on XDP detach that could deref NULL in
bpf_prog_run_xdp().
- Strip IFH and FCS pre-XDP in rx_check_frame(). After BPF runs
the driver cannot tell whether the tail was modified; drop the
unconditional skb_pull/skb_trim in rx_get_frame().
- Account tx_bytes/tx_packets on XDP_TX success and tx_dropped on
XDP_TX size reject.
- Add dma_wmb()/dma_rmb() around DCB status writes and reads in
xmit, xmit_xdpf, and napi_poll.
- Collected Tested-by: Hervé Codina.
- Link to v2: https://lore.kernel.org/r/20260428-lan966x-pci-fdma-v2-0-d3ec66e06202@microchip.com
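The two size checks in the v3 list above can be sketched roughly as follows. This is a simplified userspace illustration, not the code from the patches: the helper names are hypothetical, the constants are illustrative values (28-byte IFH, 4-byte FCS, 256-byte XDP headroom), and the exact terms the real lan966x_fdma_pci_tx_size_fits() accounts for are an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values; treat them as assumptions in this sketch. */
#define IFH_LEN_BYTES       28
#define ETH_FCS_LEN         4
#define XDP_PACKET_HEADROOM 256

/*
 * RX: a block length is only usable if it holds at least an IFH plus an
 * FCS and does not exceed the data buffer it was received into.
 */
static bool rx_blockl_valid(size_t blockl, size_t db_size)
{
	return blockl >= IFH_LEN_BYTES + ETH_FCS_LEN && blockl <= db_size;
}

/*
 * TX: the frame must fit in the slot after reserving the XDP headroom,
 * so later tail growth cannot run past the end of the slot.
 */
static bool tx_size_fits(size_t frame_len, size_t slot_size)
{
	if (slot_size < XDP_PACKET_HEADROOM)
		return false;

	return frame_len <= slot_size - XDP_PACKET_HEADROOM;
}
```

Both checks run before any length is handed to memcpy or to the XDP buffer setup, so a bogus hardware-reported length is dropped instead of being trusted.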
Changes in v2:
Version 2 primarily addresses two issues: traffic that stopped working
after a module unload/load cycle (Hervé), and XDP head/tail adjustments
that were discarded (Mohsin).
Apart from that, I went through the issues reported by Sashiko and fixed
a number of others.
- New patch 1: add drivers/net/ethernet/microchip/fdma/ to the Sparx5
SoC MAINTAINERS entry.
- New patch 7: clear latched FDMA error/interrupt stickies after the
switch reset so they don't fire as soon as interrupts are enabled.
- New patch 8: shutdown() callback, quiescing FDMA on host warm reboot.
- Replaced the depth-2 dev_is_pci(parent->parent) backend selector
with a parent-chain walk.
- XDP: use xdp.data/xdp.data_end for the post-XDP frame length so that
bpf_xdp_adjust_head/tail are respected (Mohsin Bashir).
- MTU change: drain in-flight xmits with netif_tx_disable() on every
port before reallocating rings, waking them again on completion.
- MTU change: cap the PCIe DCB ring at 256 entries so a full-ring
coherent DMA allocation fits in a single MAX_PAGE_ORDER block at
jumbo MTU.
- PCIe ATU: disable the region before clearing its translation on
unmap.
- PCIe FDMA: hold tx_lock in napi_poll around the free-DCB check used
to wake stopped netdev queues.
- PCIe FDMA: return -ENOSPC (not -1) when the DCB ring is exhausted.
- Link to v1: https://lore.kernel.org/r/20260320-lan966x-pci-fdma-v1-0-ef54cb9b0c4b@microchip.com
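The arithmetic behind the 256-entry ring cap in the v2 list above can be sketched like this, assuming 4 KiB pages, a MAX_PAGE_ORDER of 10 (a 4 MiB maximum block) and one data buffer per DCB. These values, the macro names and the helper name are assumptions for illustration; the real limits depend on the kernel configuration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed for this sketch: 4 KiB pages and MAX_PAGE_ORDER = 10. */
#define SKETCH_PAGE_SIZE      4096u
#define SKETCH_MAX_PAGE_ORDER 10
#define SKETCH_MAX_BLOCK      ((size_t)SKETCH_PAGE_SIZE << SKETCH_MAX_PAGE_ORDER)

/*
 * True if n_dcbs data buffers of db_size bytes each fit in one
 * maximum-order contiguous block. Written as a division so the
 * product cannot overflow.
 */
static bool ring_fits_one_block(size_t n_dcbs, size_t db_size)
{
	return n_dcbs != 0 && db_size <= SKETCH_MAX_BLOCK / n_dcbs;
}
```

Under these assumptions a 256-entry ring leaves room for buffers of up to 16 KiB each, while a 512-entry ring would limit them to 8 KiB.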
---
Daniel Machon (13):
MAINTAINERS: add FDMA library to Sparx5 SoC entry
net: microchip: fdma: rename contiguous dataptr helpers
net: microchip: fdma: add PCIe ATU support
net: lan966x: add FDMA LLP register write helper
net: lan966x: export FDMA helpers for reuse
net: lan966x: add FDMA ops dispatch for PCIe support
net: lan966x: clear FDMA interrupt stickies after switch reset
net: lan966x: add shutdown callback to stop FDMA on reboot
net: lan966x: add PCIe FDMA support
net: lan966x: add PCIe FDMA MTU change support
net: lan966x: add PCIe FDMA XDP support
misc: lan966x-pci: dts: extend cpu reg to cover PCIE DBI space
misc: lan966x-pci: dts: add fdma interrupt to overlay
 MAINTAINERS                                       |   1 +
 drivers/misc/lan966x_pci.dtso                     |   5 +-
 drivers/net/ethernet/microchip/fdma/Makefile      |   4 +
 drivers/net/ethernet/microchip/fdma/fdma_api.c    |  33 ++
 drivers/net/ethernet/microchip/fdma/fdma_api.h    |  25 +-
 drivers/net/ethernet/microchip/fdma/fdma_pci.c    | 182 ++++++
 drivers/net/ethernet/microchip/fdma/fdma_pci.h    |  42 ++
 drivers/net/ethernet/microchip/lan966x/Makefile   |   4 +
 .../net/ethernet/microchip/lan966x/lan966x_fdma.c |  51 +-
 .../ethernet/microchip/lan966x/lan966x_fdma_pci.c | 656 +++++++++++++++++++++
 .../net/ethernet/microchip/lan966x/lan966x_main.c |  74 ++-
 .../net/ethernet/microchip/lan966x/lan966x_main.h |  45 ++
 .../net/ethernet/microchip/lan966x/lan966x_regs.h |  25 +
 .../net/ethernet/microchip/lan966x/lan966x_xdp.c  |  10 +
 14 files changed, 1117 insertions(+), 40 deletions(-)
---
base-commit: 790ead9394860e7d70c5e0e50a35b243e909a618
change-id: 20260313-lan966x-pci-fdma-94ed485d23fa
Best regards,
--
Daniel Machon <daniel.machon at microchip.com>