[PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending

Emil Renner Berthing kernel at esmil.dk
Mon Jul 6 04:08:24 EDT 2020


On Mon, 6 Jul 2020 at 02:48, Guo Ren <guoren at kernel.org> wrote:
> On Mon, Jul 6, 2020 at 1:09 AM Emil Renner Berthing <kernel at esmil.dk> wrote:
> >
> > On Sun, 5 Jul 2020 at 17:52, Guo Ren <guoren at kernel.org> wrote:
> > > On Sun, Jul 5, 2020 at 11:03 PM Emil Renner Berthing <kernel at esmil.dk> wrote:
> > > > On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren at kernel.org> wrote:
> > > > >
> > > > > Hi Emil,
> > > > >
> > > > > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel at esmil.dk> wrote:
> > > > > >
> > > > > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > > > > > results in many errors like this on the HiFive Unleashed
> > > > > > RISC-V board:
> > > > > >
> > > > > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > > > > > caller is regmap_mmio_write32le+0x1c/0x46
> > > > > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > > > > > Call Trace:
> > > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > > > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> > > > > >
> > > > > > Signed-off-by: Emil Renner Berthing <kernel at esmil.dk>
> > > > > > ---
> > > > > > This patch fixes it, but my guess is that it's not the right
> > > > > > fix. Do anyone have a better idea?
> > > > > >
> > > > > >  include/asm-generic/mmiowb.h | 6 +++++-
> > > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > > > > > index 9439ff037b2d..31a21cdfbbcf 100644
> > > > > > --- a/include/asm-generic/mmiowb.h
> > > > > > +++ b/include/asm-generic/mmiowb.h
> > > > > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > > > > >
> > > > > >  static inline void mmiowb_set_pending(void)
> > > > > >  {
> > > > > > -       struct mmiowb_state *ms = __mmiowb_state();
> > > > > > +       struct mmiowb_state *ms;
> > > > > > +
> > > > > > +       get_cpu();
> > > > > > +       ms = __mmiowb_state();
> > > > > >         ms->mmiowb_pending = ms->nesting_count;
> > > > > > +       put_cpu();
> > > > > >  }
> > > > >
> > > > > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
> > > > >
> > > > > The ptr is a fixed address, so don't worry about the change, and just
> > > > > use an atomic_read is enough.
> > > > > static inline void mmiowb_set_pending(void)
> > > > > {
> > > > >         struct mmiowb_state *ms = __mmiowb_state();
> > > > > -       ms->mmiowb_pending = ms->nesting_count;
> > > > > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> > > > > }
> > > >
> > > > You may be right, but it doesn't fix the BUG. As far as I can tell it
> > > > happens in __mmiowb_state() which expands through this_cpu_ptr and
> > > > arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
> > > > __my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
> > > > CONFIG_DEBUG_PREEMPT smp_processor_id is actually
> > > > debug_smp_processor_id, which eventually checks that preemption is
> > > > disabled in check_preemption_disabled.
> > > Thx for explaining.
> > >
> > > Seems we need to find who disable preemption during:
> > > > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> >
> > Hmm.. the problem is that preemption is *not* disabled when
> > smp_processor_id is called, right?
>
> Yes!
>
> smp_processor_id is defined as:
>
>  * This is the normal accessor to the CPU id and should be used
>  * whenever possible.
>  *
>  * The CPU id is stable when:
>  *
>  *  - IRQs are disabled;
>  *  - preemption is disabled;
>  *  - the task is CPU affine.
>  *
>  * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
>  * when smp_processor_id() is used when the CPU id is not stable.
>
> So regmap_write->regmap_mmio_write should be PREEMPT disabled in
> sifive_gpio_probe().

Ah! Sorry, now I think I understand. So you're saying that the real
problem is that the driver framework should have disabled preemption
before calling any .probe functions, but for some reason that doesn't
happen on RISC-V?

> >
> > > do_one_initcall's preempt_count = 0
> > > (gdb) bt
> > > #0  do_one_initcall (fn=0xffffffe000003b0e
> > > <trace_init_flags_sys_exit>) at init/main.c:1190
> > > #1  0xffffffe000001f20 in do_pre_smp_initcalls () at ./include/linux/init.h:131
> > >
> > > #2  kernel_init_freeable () at init/main.c:1494
> > > #3  0xffffffe0009d6ea6 in kernel_init (unused=<optimized out>) a
> > >    t init/main.c:1399
> > > #4  0xffffffe000201c2a in handle_exception () at arch/riscv/kernel/entry.S:188
> > > Backtrace stopped: frame did not save the PC
> > > (gdb) p *(struct task_struct*)$tp
> > > $2 = {thread_info = {flags = 0, preempt_count = 0,
> > >
> > > Can you debug like this ? to see which function's preempt_count = 0 in
> > > your backtrace.
> >
> > I'm sorry, I'm not exactly sure what you mean by this, but it seems to
> > happen multiple places in writel-like functions. Both at probe time in
> > sifive_gpio_probe and sifive_spi_probe and when the spi driver is
> > running. Here is the full log:
> >
> I mean Use jtag + gdb to detect  preempt_count = 0 in parent function.
> I do this in qemu :)
>
> If couldn't use jtag, then just add printk preempt_count trace, or kgdb.

Right, thanks. I'll look into it.

/Emil



More information about the linux-riscv mailing list