REGRESSION in 6.0-rc7 caused by patch "ARM/dma-mapping: use dma-direct unconditionally"

Marcin Wojtas mw at semihalf.com
Mon Oct 3 15:03:26 PDT 2022


Hi,


pon., 3 paź 2022 o 23:35 Pali Rohár <pali at kernel.org> napisał(a):
>
> Hello!
>
> On Monday 03 October 2022 23:30:31 Marcin Wojtas wrote:
> > Hi Marek,
> >
> >
> > pon., 3 paź 2022 o 17:33 Marek Behún <kabel at kernel.org> napisał(a):
> > >
> > > On Mon, 3 Oct 2022 15:11:44 +0100
> > > "Russell King (Oracle)" <linux at armlinux.org.uk> wrote:
> > >
> > > > On Mon, Oct 03, 2022 at 09:30:37AM +0200, Christoph Hellwig wrote:
> > > > > On Fri, Sep 30, 2022 at 05:02:05PM +0200, Marek Behún wrote:
> > > > > > It seems that the null pointer dereference comes from the data variable
> > > > > > having zero value. We assign
> > > > > >   data = (u8 *)(uintptr_t)rx_desc->buf_cookie;
> > > > >
> > > > > I never see any assignment to ->buf_cookie in the driver, what am
> > > > > I missing?
> > > >
> > > > I think Marek's setup (like my setups) use the hardware buffer manager,
> > > > and it's hardware that fills in the "buf_cookie", which is supposed to
> > > > be the virtual address of the buffer.
> > > >
> > > > Each buffer supplied to the hardware buffer manager is supposed to
> > > > contain the virtual address in the first 32-bit word in that buffer.
> > > >
> > > > This is done by mvneta_bm_construct():
> > > >
> > > >         /* In order to update buf_cookie field of RX descriptor properly,
> > > >          * BM hardware expects buf virtual address to be placed in the
> > > >          * first four bytes of mapped buffer.
> > > >          */
> > > >         *(u32 *)buf = (u32)buf;
> > > >
> > > > immediately prior to dma_map_single(..., DMA_FROM_DEVICE) is called.
> > > >
> > > > If I had to guess, I would suggest that this write is being lost via
> > > > cache invalidation, and given that the hardware BM both reads and
> > > > writes this buffer, DMA_FROM_DEVICE is not correct, it should be
> > > > DMA_BIDIRECTIONAL.
> > > >
> >
> > I think the DMA_FROM_DEVICE is used rather properly in the RX path of
> > the driver - the CPU doesn't access the payload afterward. The BM only
> > pushes the pointers back and forth between internal SRAM ('internal
> > pool' - BPPI) to DRAM ('external pool' - BPPE) and the descriptors,
> > but afaik it should not touch the buffer contents. But maybe somehow
> > it affects the coherency and DMA_BIDIRECTIONAL are indeed required...
> > About the coherency itself - please see my comment below.
> >
> > Another thought - when writing to *buf (memory normal) shouldn't we add a dsb()?
>
> Hm... When you are talking about IO coherency... This reminds me that
> there is a HW bug on A38x with undocumented errata related to IO
> coherency. It is not mentioned in the official A38x errata document but
> kernel has implemented some kind of workaround for it. But last time
> when I tried to understand it I had feeling that it was implemented
> incorrectly and maybe does not do what it is expected to do?
>
> Maybe it should be revisited, now?

It mentions problems with deadlocks when outer syncs are performed:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mach-mvebu/coherency.c?h=v6.0#n183
A couple of lines above __arm_ioremap_caller is overriden, but IMO
it's not related to what we see in our issue.

If that may help - despite lack of 'dma-coherent' property in DT, the
FreeBSD works with full IO cache-coherency (e.g. cache maintenance in
NETA drivers are effectively nop's) on Clearfog and Clearfog-like
devices in production. So it's definitely possible, and I'd like to
check that in Linux (I may also have a setup this week available).

Best regards,
Marcin

>
> > > > Changing that is probably going to need DMA_FROM_DEVICE also changed
> > > > elsewhere in the mvneta_bm and mvneta driver.
> > > >
> > > > I'm not in a position where I could test that out. Marek?
> > > >
> > >
> > > Hello Russell,
> > >
> > > thanks for your suggestion!
> > >
> > > Adding Pali, since he has some information (see at the end of this
> > > message).
> > >
> > > The attached patch seems to solve the null-pointer dereference.
> >
> > Did you manage to measure performance impact?
> >
> > I have one overall concern here. On all kinds of A38x-based boards I
> > worked on, by default, the firmware set all devices (e.g. network,
> > AHCI, XHCI) on MBUS as fully IO cache coherent - it should be
> > reflected in the MVNETA_WIN_BASE(w) registers attribute field. Bits
> > [15:8] should be set to 0x1D (or 0x1E if there is a second DRAM CS
> > used). Can you please try adding 'dma-coherent;' property under the
> > 'internal-regs' node?
> >
> > > I booted into single user mode and enabled eth2. Before it caused the
> > > NULL pointer dereference after link got up, not it does not happen.
> > >
> > > But I am still encountering the freeze after booting into system.
> > >
> > > Maybe these are different bugs?
> > >
> > > I am thinking whether we don't need something similar like
> > >   7bea67a99430 ("ARM: dts integrator: Fix DMA ranges")
> > > also for mvebu.
> > > I seem to remember Pali talking about how the ranges defined in some
> > > upstream mvebu-tree, using MBUS_ID() macros, are incorrect.
> > > Pali, what do you remember about this?
> > >
> >
> > Afaik there should be no issues around MBUS configuration for non-PCIE devices.
> >
> > Best regards,
> > Marcin



More information about the linux-arm-kernel mailing list