[bug report] soc: xilinx: Fix for call trace due to the usage of smp_processor_id()

Dan Carpenter dan.carpenter at linaro.org
Thu Feb 29 23:53:52 PST 2024


On Thu, Feb 29, 2024 at 08:39:57AM +0000, Jain, Ronak wrote:
> Hi Dan,
> 
> > -----Original Message-----
> > From: Dan Carpenter <dan.carpenter at linaro.org>
> > Sent: Tuesday, February 27, 2024 5:06 PM
> > To: Jain, Ronak <ronak.jain at amd.com>
> > Cc: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>;
> > haribabu.gattem at xilinx.com; Simek, Michal <michal.simek at amd.com>;
> > linux-arm-kernel at lists.infradead.org
> > Subject: Re: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > smp_processor_id()
> > 
> > On Tue, Feb 27, 2024 at 10:10:45AM +0000, Jain, Ronak wrote:
> > > Hi Dan,
> > >
> > > > -----Original Message-----
> > > > From: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>
> > > > Sent: Tuesday, February 27, 2024 3:25 PM
> > > > To: Dan Carpenter <dan.carpenter at linaro.org>;
> > > > haribabu.gattem at xilinx.com; Jain, Ronak <ronak.jain at amd.com>
> > > > Cc: Simek, Michal <michal.simek at amd.com>; linux-arm-
> > > > kernel at lists.infradead.org
> > > > Subject: RE: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > > > smp_processor_id()
> > > >
> > > > + at Jain, Ronak
> > > >
> > > > > -----Original Message-----
> > > > > From: Dan Carpenter <dan.carpenter at linaro.org>
> > > > > Sent: Thursday, February 1, 2024 5:50 PM
> > > > > To: haribabu.gattem at xilinx.com
> > > > > Cc: Buddhabhatti, Jay <jay.buddhabhatti at amd.com>; Simek, Michal
> > > > > <michal.simek at amd.com>; linux-arm-kernel at lists.infradead.org
> > > > > Subject: [bug report] soc: xilinx: Fix for call trace due to the usage of
> > > > > smp_processor_id()
> > > > >
> > > > > Hello HariBabu Gattem,
> > > > >
> > > > > The patch daed80ed0758: "soc: xilinx: Fix for call trace due to the usage
> > of
> > > > > smp_processor_id()" from Oct 26, 2023 (linux-next), leads to the
> > following
> > > > > Smatch static checker warning:
> > > > >
> > > > > 	kernel/irq/manage.c:2614 __request_percpu_irq()
> > > > > 	warn: sleeping in atomic context
> > > > >
> > > > > drivers/soc/xilinx/xlnx_event_manager.c
> > > > >    610          cpu = get_cpu();
> > > > >                 ^^^^^^^^^^^^^^^
> > > > > The patch adds get_cpu() which disables preemption.
> > > > >
> > > > >    611          per_cpu(cpu_number1, cpu) = cpu;
> > > > >    612          ret = request_percpu_irq(virq_sgi, xlnx_event_handler,
> > > > > "xlnx_event_mgmt",
> > > > >                       ^^^^^^^^^^^^^^^^^^
> > > > > request_percpu_irq() does a sleeping allocation so it's a sleeping in
> > atomic
> > > > bug.
> > > > >
> > > > >    613                                   &cpu_number1);
> > > > >    614          put_cpu();
> > > > >
> > >
> > > I am working on this issue, and I tried to reproduce the issue but I
> > > couldn't. First, I tried enabling the below config flags for kernel
> > > preemption and then ran the smatch command but didn't get the warning
> > > you were mentioning.
> > >
> > > The configs I enabled,
> > > CONFIG_DEBUG_ATOMIC_SLEEP=y
> > 
> > This config would detect the issue at runtime.
> > 
> > > CONFIG_PREEMPT_RT=y
> > > CONFIG_PREEMPT=y
> > > CONFIG_PREEMPT_COUNT=y
> > > CONFIG_PREEMPTION=y
> > > CONFIG_PREEMPT_RCU=y
> > > CONFIG_DEBUG_PREEMPT=y
> > >
> > > The smatch command I ran,
> > > ~/smatch/smatch_scripts/kchecker --spammy
> > drivers/soc/xilinx/xlnx_event_manager.c
> > > ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> > > make ARCH=arm64 CHECK="~/smatch/smatch -p=kernel" C=2
> > drivers/soc/xilinx/xlnx_event_manager.o
> > >
> > > Could you please help me with the config you used for this issue and
> > > would be good if you could share the complete steps(the smatch
> > > commands) to reproduce the issue.
> > 
> > I'm sorry, to generate this warning you need to rebuild the cross
> > function database a few times.  Which is simple enough, but each time
> > you rebuild the database takes something like 6 hours.
> > 
> > smatch_scripts/build_kernel_data.sh
> > 
> > Each time you rebuild it, it adds one more branch to the call trees.  In
> > this case building it twice would work I guess.  Then it would be:
> > ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> > 
> > That prints the warning quoted above and then I do:
> > 
> > ~/smatch/smatch_data/db/smdb.py preempt __request_percpu_irq
> > xlnx_event_init_sgi() <- disables preempt
> > -> request_percpu_irq()
> >    -> __request_percpu_irq()
> > 
> I tried the steps you suggested but didn't get the warning you reported.
> 
> First, I tried to build a database using "smatch_scripts/build_kernel_data.sh" and then ran "~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c" multiple times but didn't encounter the original warning. Also, you mentioned that the building of the database takes around 6 hours to complete but for me, it took around 15 minutes for Xilinx internal repo and 1 hour for upstream repo i.e. linux-next.
> 
> Let me describe the overall steps I followed,
> 
> - Clone upstream linux-repo i.e. linux-next
> - export ARCH=arm64
> - export CROSS_COMPILE="aarch64-linux-gnu-"
> - make menuconfig (kept the default configs) 
> Or
> - make xilinx_defconfig (Xilinx specific config)
> - ~/smatch/smatch_scripts/build_kernel_data.sh 
> - wait for completion
> - ~/smatch/smatch_scripts/kchecker --spammy kernel/irq/manage.c
> Logs after running the kchecker
> 
> CHECK   scripts/mod/empty.c
>   CALL    scripts/checksyscalls.sh
>   CHECK   arch/arm64/kernel/vdso/vgettimeofday.c
>   CC      kernel/irq/manage.o
>   CHECK   kernel/irq/manage.c
> kernel/irq/manage.c:583 irq_set_affinity_notifier() warn: 'old_notify' was already freed.
> kernel/irq/manage.c:583 irq_set_affinity_notifier() warn: 'old_notify' was already freed.
> kernel/irq/manage.c:583 irq_set_affinity_notifier() error: dereferencing freed memory 'old_notify'
> kernel/irq/manage.c:2210 request_threaded_irq() warn: calling kfree() when 'action->secondary' is always NULL.
> 
> Can you please review the overall steps I followed? 

Hm... That's strange.

What happens if you type:

	~/smatch/smatch_data/db/smdb.py preempt request_percpu_irq

It should say that xlnx_event_init_sgi() calls it with preempt disabled.
This command should work as well if the database has been rebuilt
twice but evidently it doesn't.  :/

	~/smatch/smatch_data/db/smdb.py preempt __request_percpu_irq

You have CONFIG_PREEMPT_COUNT=y so Smatch should have all the
information necessary to track this.

The use after free warning is because Smatch assumes kref_put() will
call the free function unless the refcount is incremented within the
same function.

On my system I don't get the 'action->secondary' warning because the
cross function database has been built more...  I would have thought
that building the cross function database twice would silence that
warning because the first build would add irq_setup_forced_threading()
and the next one would add __setup_irq().

$ smdb.py return_states irq_setup_forced_threading | grep "secondary "
kernel/irq/manage.c | irq_setup_forced_threading | 687 |             0|       PARAM_ADD |  0 |         $->secondary |  2784508726059343872 |

$ smdb.py return_states __setup_irq  | grep "secondary "
kernel/irq/manage.c | __setup_irq | 756 |             0|       PARAM_ADD |  2 |         $->secondary |       0,4096-ptr_max |

But to be honest, I hadn't realized that warning was published.  I
should delete it.  Calling kfree() on a NULL is not a bug.

regards,
dan carpenter



More information about the linux-arm-kernel mailing list