[PATCH v1] RISC-V: take text_mutex during alternative patching

Guenter Roeck linux at roeck-us.net
Wed Feb 15 16:22:13 PST 2023


On 2/15/23 14:00, Heiko Stübner wrote:
> On Wednesday, 15 February 2023, 15:43:10 CET, Heiko Stübner wrote:
>> Hi Conor,
>>
>> On Monday, 13 February 2023, 23:40:38 CET, Conor Dooley wrote:
>>> +CC Heiko
>>>
>>> On Sun, Feb 12, 2023 at 03:08:55PM -0800, Guenter Roeck wrote:
>>>> On 2/12/23 11:47, Conor Dooley wrote:
>>>>> From: Conor Dooley <conor.dooley at microchip.com>
>>>>>
>>>>> Guenter reported a splat during boot, that Samuel pointed out was the
>>>>> lockdep assertion failing in patch_insn_write():
>>>>>
>>>>> WARNING: CPU: 0 PID: 0 at arch/riscv/kernel/patch.c:63 patch_insn_write+0x222/0x2f6
>>>>> epc : patch_insn_write+0x222/0x2f6
>>>>>    ra : patch_insn_write+0x21e/0x2f6
>>>>> epc : ffffffff800068c6 ra : ffffffff800068c2 sp : ffffffff81803df0
>>>>>    gp : ffffffff81a1ab78 tp : ffffffff81814f80 t0 : ffffffffffffe000
>>>>>    t1 : 0000000000000001 t2 : 4c45203a76637369 s0 : ffffffff81803e40
>>>>>    s1 : 0000000000000004 a0 : 0000000000000000 a1 : ffffffffffffffff
>>>>>    a2 : 0000000000000004 a3 : 0000000000000000 a4 : 0000000000000001
>>>>>    a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000052464e43
>>>>>    s2 : ffffffff80b4889c s3 : 000000000000082c s4 : ffffffff80b48828
>>>>>    s5 : 0000000000000828 s6 : ffffffff8131a0a0 s7 : 0000000000000fff
>>>>>    s8 : 0000000008000200 s9 : ffffffff8131a520 s10: 0000000000000018
>>>>>    s11: 000000000000000b t3 : 0000000000000001 t4 : 000000000000000d
>>>>>    t5 : ffffffffd8180000 t6 : ffffffff81803bc8
>>>>> status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
>>>>> [<ffffffff800068c6>] patch_insn_write+0x222/0x2f6
>>>>> [<ffffffff80006a36>] patch_text_nosync+0xc/0x2a
>>>>> [<ffffffff80003b86>] riscv_cpufeature_patch_func+0x52/0x98
>>>>> [<ffffffff80003348>] _apply_alternatives+0x46/0x86
>>>>> [<ffffffff80c02d36>] apply_boot_alternatives+0x3c/0xfa
>>>>> [<ffffffff80c03ad8>] setup_arch+0x584/0x5b8
>>>>> [<ffffffff80c0075a>] start_kernel+0xa2/0x8f8
>>>>>
>>>>> This issue was exposed by 702e64550b12 ("riscv: fpu: switch has_fpu() to
>>>>> riscv_has_extension_likely()"), as it is the patching in has_fpu() that
>>>>> triggers the splats in Guenter's report.
>>>>>
>>>>> Take the text_mutex before doing any code patching to satisfy lockdep.
>>>>>
>>>>> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
>>>>> Fixes: a35707c3d850 ("riscv: add memory-type errata for T-Head")
>>>>> Fixes: 1a0e5dbd3723 ("riscv: sifive: Add SiFive alternative ports")
>>>>> Reported-by: Guenter Roeck <linux at roeck-us.net>
>>>>> Link: https://lore.kernel.org/all/20230212154333.GA3760469@roeck-us.net/
>>>>> Signed-off-by: Conor Dooley <conor.dooley at microchip.com>
>>>>
>>>>
>>>> Applying this patch together with
>>>>
>>>> riscv: Fix early alternative patching
>>>> riscv: Fix Zbb alternative IDs
>>>>
>>>> _and_ disabling CONFIG_RISCV_ISA_ZBB fixes all problems observed with
>>>> riscv64 for me. With that,
>>>>
>>>> Tested-by: Guenter Roeck <linux at roeck-us.net>
>>>>
>>>> Unfortunately I still see boot failures when trying to boot riscv32
>>>> images, so something is still broken. I don't know yet if that is
>>>> a generic problem or if it is related to a riscv patch.
>>>> I am still trying to track that down ...
>>>
>>> Heiko, would you be able to look at this please?
>>> There's some more information in the Link: above.
>>
>> just for the record, I started looking into this now.
>>
>> Starting with my own code [0] on top of 6.2-rc2 I was able to verify
>> - qemu-riscv32 without zbb
>> - qemu-riscv32 with zbb
>> - qemu-riscv64 without zbb
>> - qemu-riscv64 with zbb
>> to work.
>>
>> So next up, I'll move over to the merged for-next branch to hopefully
>> see what changed between the above and the for-next code.
> 
> So now I've also tested Palmer's for-next at
>    commit ec6311919ea6 ("Merge patch series "riscv: Optimize function trace"")
> 
> again with the same variants
> - qemu-riscv32 without zbb
> - qemu-riscv32 with zbb
> - qemu-riscv64 without zbb
> - qemu-riscv64 with zbb
> 
> And all of them booted fine into an nfs-root (debian for riscv64 and a
> buildroot for riscv32).
> 
> I even forced a bug into the zbb code to make sure the patching worked
> correctly (where the kernel failed as expected).
> 
> Qemu-version for me was 7.2.50 (v7.2.0-744-g5a3633929a-dirty)
> 

Where did you get this version of qemu from? Does it possibly include
zbb-related fixes/changes on top of v7.2.0? I used v7.2.0. I also tried
the master branch as of today, but there was no difference.

Also, what C compiler and binutils version did you use?

> I did try the one from Debian-stable (qemu-5.2) but that was too old and
> didn't support Zbb yet.
> 
> One thing of note, the "active" 32bit config I had, somehow didn't produce
> working images and I needed to start a new build using the rv32_defconfig.
> 

My tests use rv32_defconfig as base for 32-bit tests, with various debug
options enabled on top.

Ok, I gave it another try. The backtraces do _not_ appear if I just
use defconfig as base. I'll try to find out what exactly triggers the
problem.

Thanks,
Guenter

> So right now, I'm not sure what more to test though.
> 
> Heiko
> 
> 