REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"

Marcin Wojtas mw at semihalf.com
Mon Oct 3 14:30:31 PDT 2022


Hi Marek,


pon., 3 paź 2022 o 17:33 Marek Behún <kabel at kernel.org> napisał(a):
>
> On Mon, 3 Oct 2022 15:11:44 +0100
> "Russell King (Oracle)" <linux at armlinux.org.uk> wrote:
>
> > On Mon, Oct 03, 2022 at 09:30:37AM +0200, Christoph Hellwig wrote:
> > > On Fri, Sep 30, 2022 at 05:02:05PM +0200, Marek Behún wrote:
> > > > It seems that the null pointer dereference comes from the data variable
> > > > having zero value. We assign
> > > >   data = (u8 *)(uintptr_t)rx_desc->buf_cookie;
> > >
> > > I never see any assignment to ->buf_cookie in the driver, what am
> > > I missing?
> >
> > I think Marek's setup (like my setups) use the hardware buffer manager,
> > and it's hardware that fills in the "buf_cookie", which is supposed to
> > be the virtual address of the buffer.
> >
> > Each buffer supplied to the hardware buffer manager is supposed to
> > contain the virtual address in the first 32-bit word in that buffer.
> >
> > This is done by mvneta_bm_construct():
> >
> >         /* In order to update buf_cookie field of RX descriptor properly,
> >          * BM hardware expects buf virtual address to be placed in the
> >          * first four bytes of mapped buffer.
> >          */
> >         *(u32 *)buf = (u32)buf;
> >
> > immediately prior to dma_map_single(..., DMA_FROM_DEVICE) is called.
> >
> > If I had to guess, I would suggest that this write is being lost via
> > cache invalidation, and given that the hardware BM both reads and
> > writes this buffer, DMA_FROM_DEVICE is not correct, it should be
> > DMA_BIDIRECTIONAL.
> >

I think the DMA_FROM_DEVICE is used rather properly in the RX path of
the driver - the CPU doesn't access the payload afterward. The BM only
pushes the pointers back and forth between internal SRAM ('internal
pool' - BPPI) to DRAM ('external pool' - BPPE) and the descriptors,
but afaik it should not touch the buffer contents. But maybe somehow
it affects the coherency and DMA_BIDIRECTIONAL are indeed required...
About the coherency itself - please see my comment below.

Another thought - when writing to *buf (memory normal) shouldn't we add a dsb()?

> > Changing that is probably going to need DMA_FROM_DEVICE also changed
> > elsewhere in the mvneta_bm and mvneta driver.
> >
> > I'm not in a position where I could test that out. Marek?
> >
>
> Hello Russell,
>
> thanks for your suggestion!
>
> Adding Pali, since he has some information (see at the end of this
> message).
>
> The attached patch seems to solve the null-pointer dereference.

Did you manage to measure performance impact?

I have one overall concern here. On all kinds of A38x-based boards I
worked on, by default, the firmware set all devices (e.g. network,
AHCI, XHCI) on MBUS as fully IO cache coherent - it should be
reflected in the MVNETA_WIN_BASE(w) registers attribute field. Bits
[15:8] should be set to 0x1D (or 0x1E if there is a second DRAM CS
used). Can you please try adding 'dma-coherent;' property under the
'internal-regs' node?

> I booted into single user mode and enabled eth2. Before it caused the
> NULL pointer dereference after link got up, not it does not happen.
>
> But I am still encountering the freeze after booting into system.
>
> Maybe these are different bugs?
>
> I am thinking whether we don't need something similar like
>   7bea67a99430 ("ARM: dts integrator: Fix DMA ranges")
> also for mvebu.
> I seem to remember Pali talking about how the ranges defined in some
> upstream mvebu-tree, using MBUS_ID() macros, are incorrect.
> Pali, what do you remember about this?
>

Afaik there should be no issues around MBUS configuration for non-PCIE devices.

Best regards,
Marcin



More information about the linux-arm-kernel mailing list