qemu:arm test failure due to commit 8053871d0f7f (smp: Fix smp_call_function_single_async() locking)

Geert Uytterhoeven geert at linux-m68k.org
Mon Apr 20 03:46:51 PDT 2015


On Sun, Apr 19, 2015 at 4:08 PM, Guenter Roeck <linux at roeck-us.net> wrote:
> On 04/19/2015 02:31 AM, Ingo Molnar wrote:
>> * Linus Torvalds <torvalds at linux-foundation.org> wrote:
>>
>>> On Sun, Apr 19, 2015 at 4:48 AM, Linus Torvalds
>>> <torvalds at linux-foundation.org> wrote:
>>>>
>>>>
>>>> Does that smaller patch work equally well?
>>>
>>>
>>> .. and here's a properly formatted email and patch.
>>>
>>>             Linus
>>
>>
>>>   kernel/smp.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/smp.c b/kernel/smp.c
>>> index 2aaac2c47683..07854477c164 100644
>>> --- a/kernel/smp.c
>>> +++ b/kernel/smp.c
>>> @@ -159,8 +159,10 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
>>>         }
>>>
>>>
>>> -       if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu))
>>> +       if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu)) {
>>> +               csd_unlock(csd);
>>>                 return -ENXIO;
>>> +       }
>>
>>
>> Acked-by: Ingo Molnar <mingo at kernel.org>
>>
> Tested-by: Guenter Roeck <linux at roeck-us.net>

I've bisected a boot regression on a real system to the same commit
8053871d0f7f ("smp: Fix smp_call_function_single_async() locking").

Linus' patch fixes it, so

Tested-by: Geert Uytterhoeven <geert+renesas at glider.be>

>> Btw., in this case we should probably also generate a WARN_ONCE()
>> warning?
>>
>> I _think_ most such callers calling an SMP function call for offline
>> or out of range CPUs are at minimum racy.
>>
> Not really; at least the online cpu part is an absolutely normal use
> case for qemu-arm.
>
> Sure, you can argue that "this isn't the real system", and that
> qemu-arm should be "fixed", but there are reasons - the emulation
> is (much) slower if the number of CPUs is set to 4, and not everyone
> who wants to use qemu has a system with as many CPUs as the emulated
> system would normally have.

In my case boot failed on r8a73a4/ape6evm, where I had added nodes for all
CPU cores to the .dtsi, while the SoC code doesn't have SMP bringup code yet.
This worked fine before.

With CONFIG_DEBUG_LL=y, the boot hung after:

    Calibrating delay loop (skipped), value calculated using timer frequency.. 26.00 BogoMIPS (lpj=130000)
    pid_max: default: 32768 minimum: 301
    Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
    Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
    CPU: Testing write buffer coherency: ok
    CPU0: update cpu_capacity 1516
    CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
    Setting up static identity map for 0x40009000 - 0x40009058

With the fix, it continues as expected with:

    Brought up 1 CPUs
    SMP: Total of 1 processors activated (26.00 BogoMIPS).
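
For what it's worth, here is how I understand the hang, as a small userspace
sketch rather than kernel code. It assumes the csd is already locked by the
caller when generic_exec_single() runs, which is what the csd_unlock() in the
patch suggests. Everything with a _model suffix, csd_try_lock(), NR_CPU_IDS,
and second_call_would_hang() are made up for illustration; the real csd_lock()
would spin where the sketch returns false.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPU_IDS    4u    /* hypothetical: cores described in the .dtsi */
#define CSD_FLAG_LOCK 0x01u

struct csd_model {
	unsigned int flags;
};

/* Only CPU0 is online, as on a board without SMP bringup code. */
static const bool cpu_online_model[NR_CPU_IDS] = { true, false, false, false };

/* Returns false where a spinning lock would wait forever. */
static bool csd_try_lock(struct csd_model *csd)
{
	if (csd->flags & CSD_FLAG_LOCK)
		return false;
	csd->flags |= CSD_FLAG_LOCK;
	return true;
}

static void csd_unlock_model(struct csd_model *csd)
{
	csd->flags &= ~CSD_FLAG_LOCK;
}

/* Model of the patched check; the caller already holds the csd lock. */
static int exec_single_model(int cpu, struct csd_model *csd, bool with_fix)
{
	if ((unsigned)cpu >= NR_CPU_IDS || !cpu_online_model[cpu]) {
		if (with_fix)
			csd_unlock_model(csd);	/* the added csd_unlock() */
		return -ENXIO;
	}

	/* In the success case the target CPU releases the lock when done. */
	csd_unlock_model(csd);
	return 0;
}

/* Two back-to-back calls to an offline CPU, reusing the same csd. */
bool second_call_would_hang(bool with_fix)
{
	struct csd_model csd = { 0 };
	bool locked = csd_try_lock(&csd);

	assert(locked);		/* the first lock attempt always succeeds */
	(void)locked;
	(void)exec_single_model(1, &csd, with_fix);

	/* Without the fix the csd is still marked locked here. */
	return !csd_try_lock(&csd);
}
```

Calling second_call_would_hang(false) models the behaviour before the patch
(the csd is never released after -ENXIO), and second_call_would_hang(true)
models it with the patch applied.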

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



More information about the linux-arm-kernel mailing list