[bug report] soc: xilinx: Fix for call trace due to the usage of smp_processor_id()
Dan Carpenter
dan.carpenter at linaro.org
Thu Feb 29 23:53:52 PST 2024
On Thu, Feb 29, 2024 at 08:39:57AM +0000, Jain, Ronak wrote:
> Hi Dan,
>
> > -----Original Message-----
> > From: Dan Carpenter <dan.carpenter at linaro.org>
> > Sent: Tuesday, February 27, 2024 5:06 PM
> > To: Jain, Ronak <ronak.jain at amd.com>
> > Cc: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>;
> > haribabu.gattem at xilinx.com; Simek, Michal <michal.simek at amd.com>;
> > linux-arm-kernel at lists.infradead.org
> > Subject: Re: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > smp_processor_id()
> >
> > On Tue, Feb 27, 2024 at 10:10:45AM +0000, Jain, Ronak wrote:
> > > Hi Dan,
> > >
> > > > -----Original Message-----
> > > > From: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>
> > > > Sent: Tuesday, February 27, 2024 3:25 PM
> > > > To: Dan Carpenter <dan.carpenter at linaro.org>;
> > > > haribabu.gattem at xilinx.com; Jain, Ronak <ronak.jain at amd.com>
> > > > Cc: Simek, Michal <michal.simek at amd.com>; linux-arm-
> > > > kernel at lists.infradead.org
> > > > Subject: RE: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > > > smp_processor_id()
> > > >
> > > > + at Jain, Ronak
> > > >
> > > > > -----Original Message-----
> > > > > From: Dan Carpenter <dan.carpenter at linaro.org>
> > > > > Sent: Thursday, February 1, 2024 5:50 PM
> > > > > To: haribabu.gattem at xilinx.com
> > > > > Cc: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>; Simek, Michal
> > > > > <michal.simek at amd.com>; linux-arm-kernel at lists.infradead.org
> > > > > Subject: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > > > > smp_processor_id()
> > > > >
> > > > > Hello HariBabu Gattem,
> > > > >
> > > > > The patch daed80ed0758: "soc: xilinx: Fix for call trace due to the usage
> > of
> > > > > smp_processor_id()" from Oct 26, 2023 (linux-next), leads to the
> > following
> > > > > Smatch static checker warning:
> > > > >
> > > > > kernel/irq/manage.c:2614 __request_percpu_irq()
> > > > > warn: sleeping in atomic context
> > > > >
> > > > > drivers/soc/xilinx/xlnx_event_manager.c
> > > > > 610 cpu = get_cpu();
> > > > > ^^^^^^^^^^^^^^^
> > > > > The patch adds get_cpu() which disables preemption.
> > > > >
> > > > > 611 per_cpu(cpu_number1, cpu) = cpu;
> > > > > 612 ret = request_percpu_irq(virq_sgi, xlnx_event_handler,
> > > > > "xlnx_event_mgmt",
> > > > > ^^^^^^^^^^^^^^^^^^
> > > > > request_percpu_irq() does a sleeping allocation so it's a sleeping in
> > atomic
> > > > bug.
> > > > >
> > > > > 613 &cpu_number1);
> > > > > 614 put_cpu();
> > > > >
> > >
> > > I am working on this issue, and I tried to reproduce the issue but I
> > > couldn't. First, I tried enabling the below config flags for kernel
> > > preemption and then ran the smatch command but didn't get the warning
> > > you were mentioning.
> > >
> > > The configs I enabled,
> > > CONFIG_DEBUG_ATOMIC_SLEEP=y
> >
> > This config would detect the issue at runtime.
> >
> > > CONFIG_PREEMPT_RT=y
> > > CONFIG_PREEMPT=y
> > > CONFIG_PREEMPT_COUNT=y
> > > CONFIG_PREEMPTION=y
> > > CONFIG_PREEMPT_RCU=y
> > > CONFIG_DEBUG_PREEMPT=y
> > >
> > > The smatch command I ran,
> > > ~/smatch/smatch_scripts/kchecker --spammy
> > drivers/soc/xilinx/xlnx_event_manager.c
> > > ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> > > make ARCH=arm64 CHECK="~/smatch/smatch -p=kernel" C=2
> > drivers/soc/xilinx/xlnx_event_manager.o
> > >
> > > Could you please help me with the config you used for this issue and
> > > would be good if you could share the complete steps(the smatch
> > > commands) to reproduce the issue.
> >
> > I'm sorry, to generate this warning you need to rebuild the cross
> > function database a few times. Which is simple enough, but each time
> > you rebuild the database takes something like 6 hours.
> >
> > smatch_scripts/build_kernel_data.sh
> >
> > Each time you rebuild it, it adds one more branch to the call trees. In
> > this case building it twice would work I guess. Then it would be:
> > ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> >
> > That prints the warning quoted above and then I do:
> >
> > ~/smatch/smatch_data/db/smdb.py preempt __request_percpu_irq
> > xlnx_event_init_sgi() <- disables preempt
> > -> request_percpu_irq()
> > -> __request_percpu_irq()
> >
> I tried the steps you suggested but didn't get the warning you reported.
>
> First, I tried to build a database using "smatch_scripts/build_kernel_data.sh" and then ran "~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c" multiple times but didn't encounter the original warning. Also, you mentioned that the building of the database takes around 6 hours to complete but for me, it took around 15 minutes for Xilinx internal repo and 1 hour for upstream repo i.e. linux-next.
>
> Let me describe the overall steps I followed,
>
> - Clone upstream linux-repo i.e. linux-next
> - export ARCH=arm64
> - export CROSS_COMPILE="aarch64-linux-gnu-"
> - make menuconfig (kept the default configs)
> Or
> - make xilinx_defconfig (Xilinx specific config)
> - ~/smatch/smatch_scripts/build_kernel_data.sh
> - wait for completion
> - ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> Logs after running the kchecker
>
> CHECK scripts/mod/empty.c
> CALL scripts/checksyscalls.sh
> CHECK arch/arm64/kernel/vdso/vgettimeofday.c
> CC kernel/irq/manage.o
> CHECK kernel/irq/manage.c
> kernel/irq/manage.c:583 irq_set_affinity_notifier() warn: 'old_notify' was already freed.
> kernel/irq/manage.c:583 irq_set_affinity_notifier() warn: 'old_notify' was already freed.
> kernel/irq/manage.c:583 irq_set_affinity_notifier() error: dereferencing freed memory 'old_notify'
> kernel/irq/manage.c:2210 request_threaded_irq() warn: calling kfree() when 'action->secondary' is always NULL.
>
> Can you please review the overall steps I followed?
Hm... That's strange.
What happens if you type:
~/smatch/smatch_data/db/smdb.py preempt request_percpu_irq
It should say that xlnx_event_init_sgi() calls it with preempt disabled.
This command should work as well if the database has been rebuilt
twice but evidently it doesn't. :/
~/smatch/smatch_data/db/smdb.py preempt __request_percpu_irq
You have CONFIG_PREEMPT_COUNT=y so Smatch should have all the
information necessary to track this.
The use after free warning is because Smatch assumes kref_put() will
call the free function unless the refcount is incremented within the
same function.
On my system I don't get the 'action->secondary' warning because the
cross function database has been built more... I would have thought
that building the cross function database twice would silence that
warning because the first build would add irq_setup_forced_threading()
and the next one would add __setup_irq().
$ smdb.py return_states irq_setup_forced_threading | grep "secondary "
kernel/irq/manage.c | irq_setup_forced_threading | 687 | 0| PARAM_ADD | 0 | $->secondary | 2784508726059343872 |
$ smdb.py return_states __setup_irq | grep "secondary "
kernel/irq/manage.c | __setup_irq | 756 | 0| PARAM_ADD | 2 | $->secondary | 0,4096-ptr_max |
But to be honest, I hadn't realized that warning was published. I
should delete it. Calling kfree() on a NULL is not a bug.
regards,
dan carpenter
More information about the linux-arm-kernel
mailing list