[bug report] soc: xilinx: Fix for call trace due to the usage of smp_processor_id()

Jain, Ronak ronak.jain at amd.com
Fri Mar 1 03:44:12 PST 2024


Hi Dan,

> -----Original Message-----
> From: Dan Carpenter <dan.carpenter at linaro.org>
> Sent: Friday, March 1, 2024 1:24 PM
> To: Jain, Ronak <ronak.jain at amd.com>
> Cc: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>;
> haribabu.gattem at xilinx.com; Simek, Michal <michal.simek at amd.com>;
> linux-arm-kernel at lists.infradead.org
> Subject: Re: [bug report] soc: xilinx: Fix for call trace due to the usage of
> smp_processor_id()
> 
> On Thu, Feb 29, 2024 at 08:39:57AM +0000, Jain, Ronak wrote:
> > Hi Dan,
> >
> > > -----Original Message-----
> > > From: Dan Carpenter <dan.carpenter at linaro.org>
> > > Sent: Tuesday, February 27, 2024 5:06 PM
> > > To: Jain, Ronak <ronak.jain at amd.com>
> > > Cc: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>;
> > > haribabu.gattem at xilinx.com; Simek, Michal <michal.simek at amd.com>;
> > > linux-arm-kernel at lists.infradead.org
> > > Subject: Re: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > > smp_processor_id()
> > >
> > > On Tue, Feb 27, 2024 at 10:10:45AM +0000, Jain, Ronak wrote:
> > > > Hi Dan,
> > > >
> > > > > -----Original Message-----
> > > > > From: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>
> > > > > Sent: Tuesday, February 27, 2024 3:25 PM
> > > > > To: Dan Carpenter <dan.carpenter at linaro.org>;
> > > > > haribabu.gattem at xilinx.com; Jain, Ronak <ronak.jain at amd.com>
> > > > > Cc: Simek, Michal <michal.simek at amd.com>; linux-arm-
> > > > > kernel at lists.infradead.org
> > > > > Subject: RE: [bug report] soc: xilinx: Fix for call trace due to the usage
> of
> > > > > smp_processor_id()
> > > > >
> > > > > + at Jain, Ronak
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Dan Carpenter <dan.carpenter at linaro.org>
> > > > > > Sent: Thursday, February 1, 2024 5:50 PM
> > > > > > To: haribabu.gattem at xilinx.com
> > > > > > Cc: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>; Simek, Michal
> > > > > > <michal.simek at amd.com>; linux-arm-kernel at lists.infradead.org
> > > > > > Subject: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > > > > > smp_processor_id()
> > > > > >
> > > > > > Hello HariBabu Gattem,
> > > > > >
> > > > > > The patch daed80ed0758: "soc: xilinx: Fix for call trace due to the
> usage
> > > of
> > > > > > smp_processor_id()" from Oct 26, 2023 (linux-next), leads to the
> > > following
> > > > > > Smatch static checker warning:
> > > > > >
> > > > > > 	kernel/irq/manage.c:2614 __request_percpu_irq()
> > > > > > 	warn: sleeping in atomic context
> > > > > >
> > > > > > drivers/soc/xilinx/xlnx_event_manager.c
> > > > > >    610          cpu = get_cpu();
> > > > > >                 ^^^^^^^^^^^^^^^
> > > > > > The patch adds get_cpu() which disables preemption.
> > > > > >
> > > > > >    611          per_cpu(cpu_number1, cpu) = cpu;
> > > > > >    612          ret = request_percpu_irq(virq_sgi, xlnx_event_handler,
> > > > > > "xlnx_event_mgmt",
> > > > > >                       ^^^^^^^^^^^^^^^^^^
> > > > > > request_percpu_irq() does a sleeping allocation so it's a sleeping in
> > > atomic
> > > > > bug.
> > > > > >
> > > > > >    613                                   &cpu_number1);
> > > > > >    614          put_cpu();
> > > > > >
> > > >
> > > > I am working on this issue, and I tried to reproduce the issue but I
> > > > couldn't. First, I tried enabling the below config flags for kernel
> > > > preemption and then ran the smatch command but didn't get the
> warning
> > > > you were mentioning.
> > > >
> > > > The configs I enabled,
> > > > CONFIG_DEBUG_ATOMIC_SLEEP=y
> > >
> > > This config would detect the issue at runtime.
> > >
> > > > CONFIG_PREEMPT_RT=y
> > > > CONFIG_PREEMPT=y
> > > > CONFIG_PREEMPT_COUNT=y
> > > > CONFIG_PREEMPTION=y
> > > > CONFIG_PREEMPT_RCU=y
> > > > CONFIG_DEBUG_PREEMPT=y
> > > >
> > > > The smatch command I ran,
> > > > ~/smatch/smatch_scripts/kchecker --spammy
> > > drivers/soc/xilinx/xlnx_event_manager.c
> > > > ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> > > > make ARCH=arm64 CHECK="~/smatch/smatch -p=kernel" C=2
> > > drivers/soc/xilinx/xlnx_event_manager.o
> > > >
> > > > Could you please help me with the config you used for this issue and
> > > > would be good if you could share the complete steps(the smatch
> > > > commands) to reproduce the issue.
> > >
> > > I'm sorry, to generate this warning you need to rebuild the cross
> > > function database a few times.  Which is simple enough, but each time
> > > you rebuild the database takes something like 6 hours.
> > >
> > > smatch_scripts/build_kernel_data.sh
> > >
> > > Each time you rebuild it, it adds one more branch to the call trees.  In
> > > this case building it twice would work I guess.  Then it would be:
> > > ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> > >
> > > That prints the warning quoted above and then I do:
> > >
> > > ~/smatch/smatch_data/db/smdb.py preempt __request_percpu_irq
> > > xlnx_event_init_sgi() <- disables preempt
> > > -> request_percpu_irq()
> > >    -> __request_percpu_irq()
> > >
> > I tried the steps you suggested but didn't get the warning you reported.
> >
> > First, I tried to build a database using "smatch_scripts/build_kernel_data.sh"
> and then ran "~/smatch/smatch_scripts/kchecker --spammy
> kernel/irq/manage.c" multiple times but didn't encounter the original
> warning. Also, you mentioned that the building of the database takes around
> 6 hours to complete but for me, it took around 15 minutes for Xilinx internal
> repo and 1 hour for upstream repo i.e. linux-next.
> >
> > Let me describe the overall steps I followed,
> >
> > - Clone upstream linux-repo i.e. linux-next
> > - export ARCH=arm64
> > - export CROSS_COMPILE="aarch64-linux-gnu-"
> > - make menuconfig (kept the default configs)
> > Or
> > - make xilinx_defconfig (Xilinx specific config)
> > - ~/smatch/smatch_scripts/build_kernel_data.sh
> > - wait for completion
> > - ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> > Logs after running the kchecker
> >
> > CHECK   scripts/mod/empty.c
> >   CALL    scripts/checksyscalls.sh
> >   CHECK   arch/arm64/kernel/vdso/vgettimeofday.c
> >   CC      kernel/irq/manage.o
> >   CHECK   kernel/irq/manage.c
> > kernel/irq/manage.c:583 irq_set_affinity_notifier() warn: 'old_notify' was
> already freed.
> > kernel/irq/manage.c:583 irq_set_affinity_notifier() warn: 'old_notify' was
> already freed.
> > kernel/irq/manage.c:583 irq_set_affinity_notifier() error: dereferencing
> freed memory 'old_notify'
> > kernel/irq/manage.c:2210 request_threaded_irq() warn: calling kfree()
> when 'action->secondary' is always NULL.
> >
> > Can you please review the overall steps I followed?
> 
> Hm... That's strange.
> 
> What happens if you type:
> 
> 	~/smatch/smatch_data/db/smdb.py preempt request_percpu_irq
Now, I can see the error after running the command. In my previous test, I was using the default config in which I didn't enable any PREEMT related configs. Today, I tried with PREEMT configs and build the database and after the second attempt I got the message,

~/smatch/smatch_data/db/smdb.py preempt __request_percpu_irq
xlnx_event_init_sgi() <- disables preempt
-> request_percpu_irq()
   -> __request_percpu_irq()

However, with the kchecker command still I am not getting the actual warning.

Anyway, I am unblocked now and can proceed with the solution.

Thanks,
Ronak
> 
> It should say that xlnx_event_init_sgi() calls it with preempt disabled.
> This command should work as well if the database has been rebuilt
> twice but evidently it doesn't.  :/
> 
> 	~/smatch/smatch_data/db/smdb.py preempt __request_percpu_irq
> 
> You have CONFIG_PREEMPT_COUNT=y so Smatch should have all the
> information necessary to track this.
> 
> The use after free warning is because Smatch assumes kref_put() will
> call the free function unless the refcount is incremented within the
> same function.
> 
> On my system I don't get the 'action->secondary' warning because the
> cross function database has been built more...  I would have thought
> that building the cross function database twice would silence that
> warning because the first build would add irq_setup_forced_threading()
> and the next one would add __setup_irq().
> 
> $ smdb.py return_states irq_setup_forced_threading | grep "secondary "
> kernel/irq/manage.c | irq_setup_forced_threading | 687 |             0|
> PARAM_ADD |  0 |         $->secondary |  2784508726059343872 |
> 
> $ smdb.py return_states __setup_irq  | grep "secondary "
> kernel/irq/manage.c | __setup_irq | 756 |             0|       PARAM_ADD |  2 |
> $->secondary |       0,4096-ptr_max |
> 
> But to be honest, I hadn't realized that warning was published.  I
> should delete it.  Calling kfree() on a NULL is not a bug.
> 
> regards,
> dan carpenter



More information about the linux-arm-kernel mailing list