[PATCH v1] RISC-V: take text_mutex during alternative patching

Heiko Stübner heiko at sntech.de
Wed Feb 22 12:33:47 PST 2023


> On Wednesday, 22 February 2023, 20:06:56 CET, Guenter Roeck wrote:
> On 2/22/23 09:43, Heiko Stübner wrote:
> > > On Wednesday, 22 February 2023, 18:03:55 CET, Heiko Stübner wrote:
> > >> On Wednesday, 22 February 2023, 17:47:10 CET, Palmer Dabbelt wrote:
> >>> On Wed, 22 Feb 2023 07:39:11 PST (-0800), heiko at sntech.de wrote:
> >>>> On Wednesday, 22 February 2023, 16:31:25 CET, Andrew Jones wrote:
> >>>>> On Wed, Feb 22, 2023 at 03:27:43PM +0100, Andrew Jones wrote:
> >>>>>> On Thu, Feb 16, 2023 at 07:33:55AM -0800, Guenter Roeck wrote:
> >>>>>>> On 2/15/23 14:00, Heiko Stübner wrote:
> >>>>>>> [ ... ]
> >>>>>>>>
> >>>>>>>> So now I've also tested Palmer's for-next at
> >>>>>>>>     commit ec6311919ea6 ("Merge patch series "riscv: Optimize function trace"")
> >>>>>>>>
> >>>>>>>> again with the same variants
> >>>>>>>> - qemu-riscv32 without zbb
> >>>>>>>> - qemu-riscv32 with zbb
> >>>>>>>> - qemu-riscv64 without zbb
> >>>>>>>> - qemu-riscv64 with zbb
> >>>>>>>>
> >>>>>>>> And all of them booted fine into a nfs-root (debian for riscv64 and a
> >>>>>>>> buildroot for riscv32).
> >>>>>>>>
> >>>>>>>> I even forced a bug into the zbb code to make sure the patching worked
> >>>>>>>> correctly (where the kernel failed as expected).
> >>>>>>>>
> >>>>>>>> Qemu-version for me was 7.2.50 (v7.2.0-744-g5a3633929a-dirty)
> >>>>>>>>
> >>>>>>>> I did try the one from Debian-stable (qemu-5.2) but that was too old and
> >>>>>>>> didn't support Zbb yet.
> >>>>>>>>
> >>>>>>>> One thing of note, the "active" 32bit config I had, somehow didn't produce
> >>>>>>>> working images and I needed to start a new build using the rv32_defconfig.
> >>>>>>>>
> >>>>>>>> So right now, I'm not sure what more to test though.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Another example:
> >>>>>>>
> >>>>>>> - build defconfig
> >>>>>>> - run
> >>>>>>>    qemu-system-riscv64 -M virt -m 512M -no-reboot -kernel arch/riscv/boot/Image \
> >>>>>>>      -snapshot -device virtio-blk-device,drive=d0 -drive file=rootfs.ext2,if=none,id=d0,format=raw \
> >>>>>>>      -append "root=/dev/vda console=ttyS0,115200 earlycon=uart8250,mmio,0x10000000,115200" \
> >>>>>>>      -nographic -monitor none
> >>>>>>>
> >>>>>>> With CONFIG_RISCV_ISA_ZBB=y, that results in
> >>>>>>>
> >>>>>>> [    0.818263] /dev/root: Can't open blockdev
> >>>>>>> [    0.818856] VFS: Cannot open root device "/dev/vda" or unknown-block(0,0): error -6
> >>>>>>> [    0.819177] Please append a correct "root=" boot option; here are the available partitions:
> >>>>>>> [    0.819808] fe00           16384 vda
> >>>>>>> [    0.819944]  driver: virtio_blk
> >>>>>>> [    0.820534] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
> >>>>>>> [    0.821101] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc8-next-20230216-00002-g80332825e240 #4
> >>>>>>> [    0.821672] Hardware name: riscv-virtio,qemu (DT)
> >>>>>>> [    0.822050] Call Trace:
> >>>>>>> [    0.822427] [<ffffffff800053e4>] dump_backtrace+0x1c/0x24
> >>>>>>> [    0.822834] [<ffffffff807f90e4>] show_stack+0x2c/0x38
> >>>>>>> [    0.823085] [<ffffffff80803aea>] dump_stack_lvl+0x3c/0x54
> >>>>>>> [    0.823351] [<ffffffff80803b16>] dump_stack+0x14/0x1c
> >>>>>>> [    0.823601] [<ffffffff807f944c>] panic+0x102/0x29e
> >>>>>>> [    0.823834] [<ffffffff80a015e2>] mount_block_root+0x18c/0x23e
> >>>>>>> [    0.824148] [<ffffffff80a0187c>] mount_root+0x1e8/0x218
> >>>>>>> [    0.824398] [<ffffffff80a019ee>] prepare_namespace+0x142/0x184
> >>>>>>> [    0.824655] [<ffffffff80a01182>] kernel_init_freeable+0x236/0x25a
> >>>>>>> [    0.824934] [<ffffffff80804602>] kernel_init+0x1e/0x10a
> >>>>>>> [    0.825201] [<ffffffff80003560>] ret_from_exception+0x0/0x16
> >>>>>>> [    0.826180] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---
> >>>>>>>
> >>>>>>> This works fine if CONFIG_RISCV_ISA_ZBB is not enabled.
> >>>>>>>
> >>>>>>> Tested with gcc 11.3, binutils 2.39, qemu v7.2.0 and qemu built from mainline.
> >>>>>>>
> >>>>>>
> >>>>>> Just to +1 this, I get the same result (unable to mount root fs) with
> >>>>>>
> >>>>>> $QEMU -cpu rv64,zbb=on \
> >>>>>>          -nographic \
> >>>>>>          -machine virt \
> >>>>>>          -kernel $KERNEL \
> >>>>>>          -append 'root=/dev/vda console=ttyS0' \
> >>>>>>          -drive file=disk.ext4,format=raw,id=hd0 \
> >>>>>>          -device virtio-blk-pci,drive=hd0
> >>>>>>
> >>>>>> kernel:   latest riscv-linux/for-next (8658db0a4a0f), defconfig
> >>>>>> gcc:      riscv-gnu-toolchain (12.1.0)
> >>>>>> binutils: riscv-gnu-toolchain (2.39)
> >>>>>> qemu:     latest master (79b677d658d3)
> >>>>>>
> >>>>>> Flipping the QEMU cpu zbb property off allows boot to succeed, i.e. it's
> >>>>>> not necessary to compile out the CONFIG_RISCV_ISA_ZBB code from the
> >>>>>> kernel, it's just necessary to avoid using it.
> >>>>>>
> >>>>>
> >>>>> Looks like something in the strncmp implementation. Only commenting it
> >>>>> out allows boot to succeed.
> >>>>
> >>>> and interestingly it seems to be something very specific. I.e. my setup is
> >>>> nfsroot-based (qemu is "just" another board in my boardfarm) and booting
> >>>> into an nfs-root works quite nicely.
> >>>>
> >>>> I guess I need to look into how to get an actual disk-image in there.
> >>>
> >>> It looks like Drew isn't using an initrd (and with NFS, presumably you
> >>> are)?  That's probably a big difference as well.
> >>
> >> There is no initrd involved in my qemu setup ;-) .
> >> Just a kernel built from current for-next + defconfig and a nfs-server
> >> holding a full Debian-riscv64 system.
> >>
> >>
> >> The magic qemu incantation is also pretty standard:
> >>
> >> /usr/local/bin/qemu-system-riscv64 -M virt -smp 2 -m 1G -display none \
> >>   -cpu rv64,zbb=true,zbc=true,svpbmt=true,Zicbom=true,Zawrs=true,sscofpmf=true,v=true \
> >>   -serial telnet:localhost:5500,server,nowait -kernel /home/devel/nfs/kernel/riscv64/Image \
> >>   -append earlycon=sbi root=/dev/nfs nfsroot=10.0.2.2:/home/devel/nfs/rootfs-riscv64virt ip=dhcp rw \
> >>   -netdev user,id=n1 -device virtio-net-pci,netdev=n1 -name boardfarm-0,process=boardfarm-vm-0 -daemonize
> >>
> >>
> >> And I also checked that my kernel really lands in the Zbb-code by simply
> >> adding a "ret" [0] and making sure it breaks vs. runs without it
> >>
> >>
> >> Next I'll try to move my rootfs into a real image-file and switch over to
> >> the commandline Drew calls, to see if it'll reproduce then.
> > 
> > With the rootfs in an image I could reproduce the issue now.
> > 
> > I hacked in a very crude change to find the invocation that acts differently
> > and voila it's a strncmp of the "/dev/vda" against "/dev/" [0].
> > 
> > It's really strange that all the other invocations produce correct results.
> > 
> 
> Not really; the failures are the only comparisons which are expected to
> return 0 and are not a complete string match.
> 
> I'd suspect that strncmp(s1, s2, strlen(s2)) will always return
> a bad result if s1 starts with s2 but is not identical to s2.

The issue comes into play when s2 is shorter than the step size of the
bitops-accelerated part (the register size, i.e. 8 bytes, here).

I.e. 

	strncmp("/dev/vda", "/dev/", 5): 1
	strncmp("/dev/vda", "/dev", 4): 1
	strncmp("/dev/vda", "/de", 3): 1
	...
but
	strncmp("/dev/vda12", "/dev/vda", 8): 0
	strncmp("/dev/vda12", "/dev/vda1", 9): 0

For size 9 this succeeds because the first chunk is "/dev/vda" and the
0-byte in the chunk containing "12" is then detected, which jumps to
processing the rest byte by byte.

Duplicating the 0-byte check for s2 [1] - i.e. jumping to the byte-wise
path when a 0-byte is detected in the second string - solves the issue.

I'm not sure whether there is a more performant solution for this,
since we do have the length available. I'll think some more about that part.


Heiko


[1]
@@ -83,6 +83,8 @@ strncmp_zbb:
        REG_L   t1, 0(a1)
        orc.b   t3, t0
        bne     t3, t5, 2f
+       orc.b   t3, t1
+       bne     t3, t5, 2f
        addi    a0, a0, SZREG
        addi    a1, a1, SZREG
        beq     t0, t1, 1b
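
As a sanity check of the reasoning above, here is a small, endian-independent
C model of the chunk-wise comparison (illustrative names only, not the actual
assembly): chunk_has_zero() plays the role of the "orc.b t3, tX; bne t3, t5, 2f"
zero-byte test, and the `fixed` flag toggles the extra s2 check added in [1].
It returns only the sign of the difference, and the buffers are padded to a
chunk multiple to model the page-safe overreads the real code relies on.

```c
#include <assert.h>
#include <string.h>

#define SZREG 8  /* step size of the accelerated path */

/* Does any byte of the chunk equal 0?  Models "orc.b; bne t3, t5, 2f". */
static int chunk_has_zero(const unsigned char *c)
{
	int i;

	for (i = 0; i < SZREG; i++)
		if (c[i] == 0)
			return 1;
	return 0;
}

/*
 * Sign-only model of the chunk-wise strncmp.  fixed == 0 models the
 * broken fast path (only s1 is scanned for a 0-byte); fixed == 1 adds
 * the check of s2 from hunk [1].  Like the assembly, the fast path
 * loads whole chunks even when the count is not a chunk multiple, so
 * callers must pass buffers padded to a multiple of SZREG.
 */
static int strncmp_model(const char *s1, const char *s2, size_t n, int fixed)
{
	size_t done = 0;

	/* Fast path: compare SZREG bytes per step. */
	while (done < n) {
		unsigned char a[SZREG], b[SZREG];
		int i;

		memcpy(a, s1 + done, SZREG);
		memcpy(b, s2 + done, SZREG);
		if (chunk_has_zero(a))
			break;			/* 0-byte in s1: go byte-wise */
		if (fixed && chunk_has_zero(b))
			break;			/* the added check for s2 */
		if (memcmp(a, b, SZREG) != 0) {
			/* first differing byte decides - this is where the
			 * broken path returns 1 for count 5, because byte 5
			 * ('v' vs the 0 terminator) lies beyond the count */
			for (i = 0; ; i++)
				if (a[i] != b[i])
					return a[i] < b[i] ? -1 : 1;
		}
		done += SZREG;
	}

	/* Byte-wise tail: honours the remaining count and stops at 0. */
	for (; done < n; done++) {
		unsigned char c1 = s1[done], c2 = s2[done];

		if (c1 != c2)
			return c1 < c2 ? -1 : 1;
		if (c1 == '\0')
			return 0;
	}
	return 0;
}
```

With this model, strncmp_model("/dev/vda", "/dev/", 5, 0) reproduces the bogus
1 from the log, while fixed == 1 detects the 0-byte in the s2 chunk, drops to
the byte-wise tail, and returns the correct 0.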



> > [0]
> > non-working - with zbb-strncmp:
> > [    2.619129] strncmp: comparing /dev/vda <-> mtd, count 3 => -1
> > [    2.620451] strncmp: comparing /dev/vda <-> ubi, count 3 => -1
> > [    2.621163] strncmp: comparing /dev/vda <-> PARTUUID=, count 9 => -1
> > [    2.621703] strncmp: comparing /dev/vda <-> PARTLABEL=, count 10 => -1
> > ------ below is the difference
> > [    2.622255] strncmp: comparing /dev/vda <-> /dev/, count 5 => 1
> > [    2.623446] strncmp: comparing /dev/vda <-> /dev/, count 5 => 1
> > ------ above is the difference
> > [    2.627064] strncmp: comparing sysfs <-> ext3, count 4 => 14
> > [    2.627646] strncmp: comparing tmpfs <-> ext3, count 4 => 15
> > [    2.628476] strncmp: comparing bdev <-> ext3, count 4 => -3
> > [    2.628990] strncmp: comparing proc <-> ext3, count 4 => 11
> > [    2.629422] strncmp: comparing cgroup <-> ext3, count 4 => -2
> > [    2.629856] strncmp: comparing cgroup2 <-> ext3, count 4 => -2
> > [    2.630345] strncmp: comparing cpuset <-> ext3, count 4 => -2
> > [    2.630785] strncmp: comparing devtmpfs <-> ext3, count 4 => -1
> > [    2.631267] strncmp: comparing debugfs <-> ext3, count 4 => -1
> > [    2.631696] strncmp: comparing securityfs <-> ext3, count 4 => 1
> > [    2.632596] strncmp: comparing sockfs <-> ext3, count 4 => 14
> > [    2.633095] strncmp: comparing bpf <-> ext3, count 4 => -3
> > [    2.633600] strncmp: comparing pipefs <-> ext3, count 4 => 11
> > [    2.634071] strncmp: comparing ramfs <-> ext3, count 4 => 13
> > [    2.634501] strncmp: comparing hugetlbfs <-> ext3, count 4 => 1
> > [    2.634955] strncmp: comparing rpc_pipefs <-> ext3, count 4 => 1
> > [    2.635407] strncmp: comparing devpts <-> ext3, count 4 => -1
> > [    2.638788] strncmp: comparing ext3 <-> ext3, count 4 => 0
> > 
> > 
> > working - with normal strncmp:
> > [    2.416438] strncmp: comparing /dev/vda <-> mtd, count 3 => -62
> > [    2.417312] strncmp: comparing /dev/vda <-> ubi, count 3 => -70
> > [    2.418296] strncmp: comparing /dev/vda <-> PARTUUID=, count 9 => -33
> > [    2.418921] strncmp: comparing /dev/vda <-> PARTLABEL=, count 10 => -33
> > ------ below is the difference
> > [    2.419450] strncmp: comparing /dev/vda <-> /dev/, count 5 => 0
> > [    2.420698] strncmp: comparing /dev/vda <-> /dev/, count 5 => 0
> > ------ above is the difference
> > [    2.424086] strncmp: comparing sysfs <-> ext3, count 4 => 14
> > [    2.424663] strncmp: comparing tmpfs <-> ext3, count 4 => 15
> > [    2.425111] strncmp: comparing bdev <-> ext3, count 4 => -3
> > [    2.425563] strncmp: comparing proc <-> ext3, count 4 => 11
> > [    2.426022] strncmp: comparing cgroup <-> ext3, count 4 => -2
> > [    2.426717] strncmp: comparing cgroup2 <-> ext3, count 4 => -2
> > [    2.427175] strncmp: comparing cpuset <-> ext3, count 4 => -2
> > [    2.427608] strncmp: comparing devtmpfs <-> ext3, count 4 => -1
> > [    2.428061] strncmp: comparing debugfs <-> ext3, count 4 => -1
> > [    2.428526] strncmp: comparing securityfs <-> ext3, count 4 => 14
> > [    2.428985] strncmp: comparing sockfs <-> ext3, count 4 => 14
> > [    2.429423] strncmp: comparing bpf <-> ext3, count 4 => -3
> > [    2.429863] strncmp: comparing pipefs <-> ext3, count 4 => 11
> > [    2.430433] strncmp: comparing ramfs <-> ext3, count 4 => 13
> > [    2.431202] strncmp: comparing hugetlbfs <-> ext3, count 4 => 3
> > [    2.431721] strncmp: comparing rpc_pipefs <-> ext3, count 4 => 13
> > [    2.432222] strncmp: comparing devpts <-> ext3, count 4 => -1
> > [    2.432685] strncmp: comparing ext3 <-> ext3, count 4 => 0
> > 
> > 
> > 
> 
> 
