[PATCH v2] riscv: cacheinfo: Fix using smp_processor_id() in preemptible

Palmer Dabbelt palmerdabbelt at google.com
Tue Dec 22 23:53:51 EST 2020


On Tue, 22 Dec 2020 08:23:06 PST (-0800), arnd at kernel.org wrote:
> On Tue, Dec 22, 2020 at 5:01 PM Kefeng Wang <wangkefeng.wang at huawei.com> wrote:
>>
>> Use raw_smp_processor_id instead of smp_processor_id() to fix warning,
>>
>> BUG: using smp_processor_id() in preemptible [00000000] code: init/1
>> caller is debug_smp_processor_id+0x1c/0x26
>> CPU: 0 PID: 1 Comm: init Not tainted 5.10.0-rc4 #211
>> Call Trace:
>>   walk_stackframe+0x0/0xaa
>>   show_stack+0x32/0x3e
>>   dump_stack+0x76/0x90
>>   check_preemption_disabled+0xaa/0xac
>>   debug_smp_processor_id+0x1c/0x26
>>   get_cache_size+0x18/0x68
>>   load_elf_binary+0x868/0xece
>>   bprm_execve+0x224/0x498
>>   kernel_execve+0xdc/0x142
>>   run_init_process+0x90/0x9e
>>   try_to_run_init_process+0x12/0x3c
>>   kernel_init+0xb4/0xf8
>>   ret_from_exception+0x0/0xc
>>
>> The issue is found when CONFIG_DEBUG_PREEMPT enabled.
>
> The warning sounds legitimate.
>
>>  static struct cacheinfo *get_cacheinfo(u32 level, enum cache_type type)
>>  {
>> -       struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(smp_processor_id());
>> +       struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(raw_smp_processor_id());
>
> And this just hides the warning if someone calls this function
> without disabling preemption first.
>
> If you cannot use the normal get_cpu()/put_cpu() interfaces or
> the cacheinfo of the boot CPU, maybe add a comment to explain
> why raw_smp_processor_id() is necessary here.

The problem is actually bigger: the interface we're using to provide this
information to userspace just assumes there's a homogeneous cache hierarchy
(and a fixed number of levels).  There's nothing we can do with the hart ID to
fix that, so at a bare minimum we need some mechanism to stop providing this
information when it can't be accurate.  That would likely sort out this issue
as well, as we wouldn't be coupling this to the current processor ID.

That said, I'm somewhat inclined to take this (or something like it), as the
warning is somewhat noisy and the real problem is something different.  We
probably need a new interface at some point, but we don't have any systems that
violate the current one so it's not likely to be all that high priority.

Certainly warrants a comment, though (and I also had mine just using a static
hart id, so at least users get deterministic results).



More information about the linux-riscv mailing list