[PATCH v5 00/13] riscv: improve boot time isa extensions handling
Guenter Roeck
linux at roeck-us.net
Sun Feb 12 14:21:59 PST 2023
On 2/12/23 12:39, Conor Dooley wrote:
> On Sun, Feb 12, 2023 at 12:27:10PM -0800, Guenter Roeck wrote:
>> On 2/12/23 10:45, Conor Dooley wrote:
>> ...
>>>
>>>> However, I still see that the patch series
>>>> results in boot hangs with the sifive_u qemu emulation, where
>>>> the log ends with "Oops - illegal instruction". Is that problem
>>>> being addressed as well ?
>>>
>>> Hmm, if it died on the last commit in this series, then I am not sure.
>>> If you meant with riscv/for-next or linux-next that's fixed by a patch
>>> from Samuel:
>>> https://patchwork.kernel.org/project/linux-riscv/patch/20230212021534.59121-3-samuel@sholland.org/
>>>
>>
>> It failed after the merge, so it looks like it may have been merge damage.
>>
>> Anyway, I applied
>>
>> RISC-V: Don't check text_mutex during stop_machine
>
> That being:
> https://lore.kernel.org/all/20220322022331.32136-1-palmer@rivosinc.com/
> Which handles the lockdep assertion during stop_machine...
>
>> riscv: Fix early alternative patching
>> riscv: Fix Zbb alternative IDs
>>
>> and the sifive_u emulation no longer crashes. However, I still get
>>
>> [ 0.000000] ------------[ cut here ]------------
>> [ 0.000000] WARNING: CPU: 0 PID: 0 at arch/riscv/kernel/patch.c:71 patch_insn_write+0x222/0x2f6
>
> ...but doesn't prevent the early "spam" of assertion failures from the
> code patching for alternatives. I sent a patch to take the lock during
> the alternative patching which should get rid of them for you. It did
> for me at least!
> https://lore.kernel.org/all/20230212194735.491785-1-conor@kernel.org
>
>> repeated several times.
>>
>> I then also tested
>>
>> riscv: patch: Fixup lockdep warning in stop_machine
>
> This one just deletes the lockdep check, so I would expect it to remove
> the complaints.
>
>> riscv: Fix early alternative patching
>> riscv: Fix Zbb alternative IDs
>>
>> which works fine (no warning backtrace) for sifive_u, but gives me
>>
>> WARNING: CPU: 0 PID: 0 at kernel/trace/trace_events.c:433 trace_event_raw_init+0xde/0x642
>
> Hmm, do you have the full splat for this one handy?
>
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/trace_events.c:433 trace_event_raw_init+0xde/0x642
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.2.0-rc7-next-20230210 #1
[ 0.000000] Hardware name: riscv-virtio,qemu (DT)
[ 0.000000] epc : trace_event_raw_init+0xde/0x642
[ 0.000000] ra : trace_event_raw_init+0x45a/0x642
[ 0.000000] epc : ffffffff8010571a ra : ffffffff80105a96 sp : ffffffff81803e60
[ 0.000000] gp : ffffffff81a1ab78 tp : ffffffff81814f80 t0 : 0000000000000000
[ 0.000000] t1 : 5245432d3e000000 t2 : 0000000000000000 s0 : ffffffff81803f20
[ 0.000000] s1 : 000000000000045f a0 : 0000000000000000 a1 : ffffffff81331ef0
[ 0.000000] a2 : 000000000000025c a3 : 0000000000000001 a4 : ffffffff801056fa
[ 0.000000] a5 : 000000000000002c a6 : ffffffff8192e4d8 a7 : ffffffff81157a90
[ 0.000000] s2 : 0000000000000000 s3 : ffffffff81922870 s4 : ffffffff8192e4d8
[ 0.000000] s5 : ffffffff81011c30 s6 : 000000000000000a s7 : 0000000000000021
[ 0.000000] s8 : 000000000000005c s9 : ffffffff81331ee8 s10: 0000000000000001
[ 0.000000] s11: 0000000000000000 t3 : 0000000000000007 t4 : 0000000000000070
[ 0.000000] t5 : 0000000000000025 t6 : 0000000000000009
[ 0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
[ 0.000000] [<ffffffff8010571a>] trace_event_raw_init+0xde/0x642
[ 0.000000] [<ffffffff80104d32>] event_init+0x28/0x84
[ 0.000000] [<ffffffff80c0f7ca>] trace_event_init+0x9e/0x2ae
[ 0.000000] [<ffffffff80c0f3a0>] trace_init+0x10/0x18
[ 0.000000] [<ffffffff80c00bc6>] start_kernel+0x50e/0x8f8
[ 0.000000] irq event stamp: 0
[ 0.000000] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 0.000000] hardirqs last disabled at (0): [<0000000000000000>] 0x0
[ 0.000000] softirqs last enabled at (0): [<0000000000000000>] 0x0
[ 0.000000] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] event btrfs_clear_extent_bit has unsafe dereference of argument 1
[ 0.000000] print_fmt: "%pU: io_tree=%s ino=%llu root=%llu start=%llu len=%llu clear_bits=%s", REC->fsid, __print_symbolic(REC->owner, {IO_TREE_FS_PINNED_EXTENTS, "PINNED_EXTENTS"}, {IO_TREE_FS_EXCLUDED_EXTENTS, "EXCLUDED_EXTENTS"}, {IO_TREE_BTREE_INODE_IO, "BTREE_INODE_IO"}, {IO_TREE_INODE_IO, "INODE_IO"}, {IO_TREE_RELOC_BLOCKS, "RELOC_BLOCKS"}, {IO_TREE_TRANS_DIRTY_PAGES, "TRANS_DIRTY_PAGES"}, {IO_TREE_ROOT_DIRTY_LOG_PAGES, "ROOT_DIRTY_LOG_PAGES"}, {IO_TREE_INODE_FILE_EXTENT, "INODE_FILE_EXTENT"}, {IO_TREE_LOG_CSUM_RANGE, "LOG_CSUM_RANGE"}, {IO_TREE_SELFTEST, "SELFTEST"}), REC->ino, REC->rootid, REC->start, REC->len, __print_flags(REC->clear_bits, "|", { EXTENT_DIRTY, "DIRTY"}, { EXTENT_UPTODATE, "UPTODATE"}, { EXTENT_LOCKED, "LOCKED"}, { EXTENT_NEW, "NEW"}, { EXTENT_DELALLOC, "DELALLOC"}, { EXTENT_DEFRAG, "DEFRAG"}, { EXTENT_BOUNDARY, "BOUNDARY"}, { EXTENT_NODATASUM, "NODATASUM"}, { EXTENT_CLEAR_META_RESV, "CLEAR_META_RESV"}, { EXTENT_NEED_WAIT, "NEED_WAIT"}, { EXTENT_NORESERVE,
"NORESERVE"}, { EXTENT_QGROUP_RESERV
[ 0.000000] event btrfs_ordered_sched has unsafe dereference of argument 1
[ 0.000000] print_fmt: "%pU: work=%p (normal_work=%p) wq=%p func=%ps ordered_func=%p ordered_free=%p", REC->fsid, REC->work, REC->normal_work, REC->wq, REC->func, REC->ordered_func, REC->ordered_free
[ 0.000000] event btrfs_work_sched has unsafe dereference of argument 1
[ 0.000000] print_fmt: "%pU: work=%p (normal_work=%p) wq=%p func=%ps ordered_func=%p ordered_free=%p", REC->fsid, REC->work, REC->normal_work, REC->wq, REC->func, REC->ordered_func, REC->ordered_free
[ 0.000000] event btrfs_work_queued has unsafe dereference of argument 1
[ 0.000000] print_fmt: "%pU: work=%p (normal_work=%p) wq=%p func=%ps ordered_func=%p ordered_free=%p", REC->fsid, REC->work, REC->normal_work, REC->wq, REC->func, REC->ordered_func, REC->ordered_free
[ 0.000000] event find_free_extent_search_loop has unsafe dereference of argument 1
and so on.

It bisects to "RISC-V: add zbb support to string functions", which also seems
to cause various boot failures. Unfortunately, that patch is difficult to revert,
but marking TOOLCHAIN_HAS_ZBB as broken "fixes" it. I don't know whether the
problem is in the patch or in qemu. For the time being, I'll disable
RISCV_ISA_ZBB in my tests to work around it.
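
As a sketch of that workaround (an assumption about the resulting .config,
not taken from the actual test setup), disabling the option would leave the
following line in the generated .config:

    # Assumed spelling of the option with the usual CONFIG_ prefix;
    # this is the standard form Kconfig writes for a disabled bool.
    # CONFIG_RISCV_ISA_ZBB is not set
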
Guenter
>> and a whole lot of
>>
>> event btrfs_clear_extent_bit has unsafe dereference of argument 1
>>
>> and similar messages when running the "virt" emulation. That was there before,
>> but drowned in the noise. Ok, guess I'll need another round of bisect.
>
> Thanks for all of your testing :)
>