[PATCH master 00/23] treewide: fix bugs using DMA API
Ahmad Fatoum
a.fatoum at pengutronix.de
Tue Apr 23 23:40:35 PDT 2024
As described in commit b986aad24ab8 ("mci: core: allocate memory used
for DMA with dma_alloc"), the recent fix to ARMv8 cache operations in
commit 65ef5d885263 ("ARM64: let 'end' point after the range in cache
functions") may lead to unearthing some of the alignment bugs we have:
These bugs were already there: If a DMA buffer is misaligned and you do
cache maintenance on it, you will corrupt memory that's unlucky enough to share
the cache line. This has been the case for many years though, which I
think is because that corruption was limited to the driver itself:
If a driver invalidates only part of its buffer, then that is its
problem and that of its consumers (e.g. TFTP failing for some file
names, because the network driver only invalidated part of the packet).
When we start correctly invalidating the whole buffer though,
invalidating misaligned buffers will lead us to possibly corrupt other
allocations after it, which makes the problem less localized.
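
To make the cache line sharing concrete, here is a minimal standalone
sketch (not barebox code) that computes which address range a cache
maintenance operation really touches for a given buffer. The 64-byte
line size and all sketch_* names are assumptions made up for
illustration; the real line size is CPU-specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real size is CPU-specific. */
#define SKETCH_CACHE_LINE ((uintptr_t)64)

/* Hypothetical type: address range [start, end) hit by cache maintenance. */
struct sketch_span {
	uintptr_t start;
	uintptr_t end;
};

/* Widen [addr, addr + len) to whole cache lines, as the hardware does. */
static struct sketch_span sketch_cache_span(uintptr_t addr, size_t len)
{
	struct sketch_span span;

	/* The mask arithmetic below only works for power-of-two line sizes. */
	assert((SKETCH_CACHE_LINE & (SKETCH_CACHE_LINE - 1)) == 0);

	span.start = addr & ~(SKETCH_CACHE_LINE - 1);
	span.end = (addr + len + SKETCH_CACHE_LINE - 1) & ~(SKETCH_CACHE_LINE - 1);
	return span;
}

/* Bytes outside the buffer that share its first or last cache line. */
static size_t sketch_collateral_bytes(uintptr_t addr, size_t len)
{
	struct sketch_span span = sketch_cache_span(addr, len);

	return (size_t)(span.end - span.start) - len;
}
```

With these assumptions, a 100-byte buffer at 0x1010 spans 0x1000 to
0x1080, so 28 bytes belonging to neighbouring allocations (16 before,
12 after) are affected by an invalidate of the whole buffer. A buffer
that starts on a line boundary and covers whole lines has no such
collateral bytes.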
Anyhow, the fix is correct and I spent some time going through all our
allocations to check whether they adhere to the DMA API. Having some way
to encode this into the type system would be nice for the future (maybe
something via named address spaces[1]), but for now I took the laborious
way of grepping for all /alloc/, /dma_map_single/ and /dma_sync_for/ we
have and checking them by hand.
I intend to document our expectation around the DMA API soon, but for
now, with this series applied our expectations are as follows:
- Streaming DMA is only permissible with suitably aligned buffers,
e.g. those allocated with dma_alloc()
- DMA to stack needs to be eradicated. We currently seem to do this
in three places still: HABv4, Raspberry Pi mailbox and some Virt I/O
- "User" code should not need to call dma_alloc(). Buffers passed
to read/write or cdev_read/cdev_write should be able to have
arbitrary alignment. We could add a future "zero-copy" way, but
currently drivers either use bounce buffers (e.g. RAW NAND with
Denali, CAAM crypto or qemu_fw_cfg) or intermediate layers handle it
(e.g. block cache for MMC, ATA, NVMe), so user code need not worry.
- USB buffers and Network packets should always be allocated with
dma_alloc (or net_alloc_packet). No exceptions.
- Especially network drivers must call dma_map_single on receive buffers
once allocated and before bringing up the interface. Otherwise we have
a race between CPU cache and device DMA. This applies to other users
as well, but not observing it is less problematic, because e.g. MMC
reads are synchronous while NIC RX is async.
- Kernel code often does DMA to buffers allocated with kmalloc and
friends. kmalloc now calls dma_alloc instead of normal malloc to
maintain kernel compatibility.
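
As a rough illustration of the first and third points, the following
standalone sketch shows a bounce buffer: the caller's buffer may have
any alignment, and the driver copies it into a suitably aligned
allocation before handing it to the device. This is not the barebox
implementation. All sketch_* names are hypothetical, the 64-byte
alignment is an assumption, C11 aligned_alloc() stands in for
dma_alloc(), and the transfer callback stands in for the actual
hardware access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 64-byte DMA alignment; the real value is platform-specific. */
#define SKETCH_DMA_ALIGNMENT ((size_t)64)

/* Round a length up so the buffer owns every cache line it touches. */
static size_t sketch_dma_round_up(size_t len)
{
	return (len + SKETCH_DMA_ALIGNMENT - 1) & ~(SKETCH_DMA_ALIGNMENT - 1);
}

/* Hypothetical check in the spirit of an alignment check at map time. */
static bool sketch_dma_is_aligned(const void *buf)
{
	return ((uintptr_t)buf & (SKETCH_DMA_ALIGNMENT - 1)) == 0;
}

/* Stand-in for dma_alloc(): aligned start, size padded to the alignment. */
static void *sketch_dma_alloc(size_t len)
{
	return aligned_alloc(SKETCH_DMA_ALIGNMENT, sketch_dma_round_up(len));
}

/* Hypothetical device transfer callback; returns 0 on success. */
typedef int (*sketch_xfer_fn)(const void *dma_buf, size_t len);

/* Example callback that rejects misaligned buffers instead of doing DMA. */
static int sketch_require_aligned_xfer(const void *dma_buf, size_t len)
{
	(void)len;
	return sketch_dma_is_aligned(dma_buf) ? 0 : -1;
}

/* Copy arbitrarily aligned user data into a bounce buffer, then transfer. */
static int sketch_bounce_write(const void *user_buf, size_t len,
			       sketch_xfer_fn xfer)
{
	void *bounce;
	int ret;

	if (len == 0)
		return 0;

	bounce = sketch_dma_alloc(len);
	if (!bounce)
		return -1;
	assert(sketch_dma_is_aligned(bounce));

	memcpy(bounce, user_buf, len);
	ret = xfer(bounce, len);
	free(bounce);
	return ret;
}
```

The extra copy is the cost of keeping "user" code free of alignment
requirements; a zero-copy path would instead have to require callers
to pass buffers that already satisfy the alignment check.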
Tested on top of master on STM32MP1 (MC-1), AM335 (Beaglebone Black),
BCM2711 (Raspberry Pi 4 32-bit), BCM2835 (Raspberry Pi 3 32-bit),
i.MX6 (RIoT-Board), RK3568 (Rock 3A), i.MX8MP (TQMA8MPXL) and
i.MX8MN (EVK).
[1]: ISO/IEC JTC1 SC22 WG14 N1275
Ahmad Fatoum (23):
habv4: use DMA-capable memory for getting event from BootROM
dma: give inline dma_alloc a single external definition
dma: add definition for dma_zalloc
include: linux/kernel.h: factor out alignment macros
driver: move out struct device definition into its own header
dma: remove common.h include from asm/dma.h
RISC-V: dma: fix dma.h inclusion
sandbox: dma: drop unused driver.h include
dma: remove linux/kernel.h dependency from dma.h
include: linux/slab: fix possible overflow in kmalloc_array
include: linux/slab: use dma_alloc for kmalloc
include: linux/slab: retire krealloc
commands: mmc_extcsd: use DMA capable memory where needed
net: macb: use DMA-capable memory for receive buffer
firmware: qemu_fw_cfg: use bounce buffer for write
net: usb: asix: use dma_alloc for buffers in USB control messages
net: usb: smsc95xx: use DMA memory for usb_control_msg
usb: hub: use DMA memory in usb_get_port_status
usb: hub: use DMA-capable memory in usb_hub_configure
treewide: use new dma_zalloc instead of opencoding
usb: dwc2: host: fix mismatch between dma_map_single and unmap
net: bcmgenet: map DMA buffers with dma_map_single
dma: debug: add alignment check when mapping buffers
arch/arm/include/asm/dma.h | 5 +-
arch/kvx/include/asm/dma.h | 4 +-
arch/mips/include/asm/dma.h | 3 +-
arch/mips/lib/dma-default.c | 1 +
arch/riscv/cpu/dma.c | 2 +-
arch/riscv/include/asm/dma.h | 2 -
arch/sandbox/include/asm/dma.h | 1 -
commands/mmc_extcsd.c | 4 +-
drivers/dma/debug.c | 5 +
drivers/dma/map.c | 17 ++++
drivers/firmware/qemu_fw_cfg.c | 20 +++-
drivers/hab/habv4.c | 3 +-
drivers/net/bcmgenet.c | 13 +--
drivers/net/fsl-fman.c | 4 +-
drivers/net/macb.c | 4 +-
drivers/net/usb/asix.c | 8 +-
drivers/net/usb/smsc95xx.c | 15 ++-
drivers/soc/starfive/jh7100_dma.c | 2 +-
drivers/usb/core/hub.c | 49 ++++++----
drivers/usb/dwc2/host.c | 4 +-
drivers/usb/gadget/function/f_fastboot.c | 3 +-
drivers/video/mipi_dbi.c | 3 +-
fs/ext4/ext4_common.h | 10 +-
include/device.h | 111 +++++++++++++++++++++++
include/dma.h | 16 +++-
include/driver.h | 93 +------------------
include/linux/align.h | 13 +++
include/linux/device.h | 2 -
include/linux/kernel.h | 9 +-
include/linux/pagemap.h | 2 +-
include/linux/slab.h | 20 ++--
lib/kasan/test_kasan.c | 4 +-
32 files changed, 266 insertions(+), 186 deletions(-)
create mode 100644 include/device.h
create mode 100644 include/linux/align.h
--
2.39.2