am335x: 5.18.x: system stalling

Arnd Bergmann arnd at arndb.de
Thu May 26 23:38:42 PDT 2022


On Fri, May 27, 2022 at 6:44 AM Yegor Yefremov
<yegorslists at googlemail.com> wrote:
> On Thu, May 26, 2022 at 4:16 PM Arnd Bergmann <arnd at arndb.de> wrote:
> >
> > On Thu, May 26, 2022 at 2:37 PM Yegor Yefremov
> > <yegorslists at googlemail.com> wrote:
> > > On Thu, May 26, 2022 at 10:19 AM Ard Biesheuvel <ardb at kernel.org> wrote:
> > > >
> > > > On Thu, 26 May 2022 at 08:20, Tony Lindgren <tony at atomide.com> wrote:
> > > > >
> > > > > * Yegor Yefremov <yegorslists at googlemail.com> [220526 05:45]:
> > > > > > On Tue, May 24, 2022 at 4:19 PM Tony Lindgren <tony at atomide.com> wrote:
> > > > > > > Maybe also try with CONFIG_MUSB_PIO_ONLY=y to see if it makes things
> > > > > > > better or worse :)
> > > > > >
> > > > > > PIO is always the last resort :-) And now it proves it again. With
> > > > > > PIO_ONLY the system doesn't stall.
> > > > >
> > > > > OK great :) So it has something to do with drivers/dma/ti/cppi41.c, or
> > > > > with drivers/usb/musb/cppi_dma.c or whatever the dma for am335x here
> > > > > is. Or maybe there's something using stack for buffers being passed to
> > > > > dma again that breaks with vmap stack.
> > > > >
> > > >
> > > > In order to confirm this theory, could you please try rebuilding your
> > > > kernel with CONFIG_VMAP_STACK disabled, and leave everything else as
> > > > before?
> > >
> > > I have disabled the CONFIG_VMAP_STACK option:
> > >
> > > # zcat /proc/config.gz | grep VMAP_STACK
> > > CONFIG_HAVE_ARCH_VMAP_STACK=y
> > > # CONFIG_VMAP_STACK is not set
> > >
> > > The system stalls.
> >
> > Ok, I guess that means we can stop looking for invalid DMA buffers
> > on stacks. Out of the original commits you listed as possible causes,
> > we can also rule out 23d9a9280efe ("ARM: 9177/1: disable vmap'ed
> > stacks on suspend-capable SMP configs") and cafc0eab1689
> > ("ARM: v7m: enable support for IRQ stacks"). It could still be
> > 9c46929e7989 ("ARM: implement THREAD_INFO_IN_TASK for
> > uniprocessor systems") and 5fe41793bc78 ("ARM: 9176/1: avoid
> > literal references in inline assembly") or possibly the merge.
> >
> > Can you post the whole .config file somewhere for reference?
> > In particular, do you have CONFIG_SMP, CONFIG_LD_IS_LLD
> > or CONFIG_CURRENT_POINTER_IN_TPIDRURO set?
>
> This is my config [1] and this is the system in question [2].
>
> [1] https://github.com/visionsystemsgmbh/onrisc_br_bsp/blob/master/board/vscom/baltos/linux-experimental-config

Thanks! The first thing I noticed in here is that this config enables both
CONFIG_ARCH_MULTI_V6 (for OMAP2) and CONFIG_SMP, which
gets you into a couple of corner cases that nobody else hits in practice.

Can you still reproduce the problem if you turn off both of these?
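
A minimal sketch for confirming the state of these two options in a given
config file before and after the change. The helper name config_state is
made up for illustration, and it assumes the usual .config layout
("CONFIG_FOO=y" for enabled symbols, "# CONFIG_FOO is not set" or no entry
for disabled ones):

```shell
#!/bin/sh
# config_state <config-file> <SYMBOL>
# Prints the value of CONFIG_<SYMBOL> (e.g. "y" or "m"), or "n" if the
# symbol is commented out or absent. Hypothetical helper, not a kernel tool.
config_state() {
    file=$1
    sym=$2
    if grep -q "^CONFIG_${sym}=" "$file"; then
        # Take the first match and print everything after the first "=".
        grep "^CONFIG_${sym}=" "$file" | head -n 1 | cut -d= -f2-
    else
        echo "n"
    fi
}
```

On the target this could be used as
"zcat /proc/config.gz > /tmp/running.config" followed by
"config_state /tmp/running.config SMP" and
"config_state /tmp/running.config ARCH_MULTI_V6". In the kernel tree, one
way to turn both off is
"scripts/config --disable ARCH_MULTI_V6 --disable SMP" and then
"make ARCH=arm olddefconfig" to let Kconfig resolve the dependent options.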

       Arnd


