[PATCH 5/7] RISC-V: fix auipc-jalr addresses in patched alternatives
Heiko Stübner
heiko at sntech.de
Mon Nov 21 14:17:11 PST 2022
Am Montag, 21. November 2022, 22:31:36 CET schrieb Lad, Prabhakar:
> Hi Heiko,
>
> On Mon, Nov 21, 2022 at 3:06 PM Lad, Prabhakar
> <prabhakar.csengg at gmail.com> wrote:
> >
> > Hi Heiko,
> >
> > On Mon, Nov 21, 2022 at 11:27 AM Heiko Stübner <heiko at sntech.de> wrote:
> > >
> > > Hi,
> > >
> > > Am Montag, 21. November 2022, 10:50:09 CET schrieb Lad, Prabhakar:
> > > > On Thu, Nov 10, 2022 at 4:50 PM Heiko Stuebner <heiko at sntech.de> wrote:
> > > > >
> > > > > From: Heiko Stuebner <heiko.stuebner at vrull.eu>
> > > > >
> > > > > Alternatives live in a different section, so addresses used by call
> > > > > functions will point to wrong locations after the patch got applied.
> > > > >
> > > > > Similar to arm64, adjust the location to consider that offset.
> > > > >
> > > > > Signed-off-by: Heiko Stuebner <heiko.stuebner at vrull.eu>
> > > > > ---
> > >
> > > [...]
> > >
> > > > I have the below assembly code which I have tested without the
> > > > alternatives for the RZ/Five CMO,
> > > >
> > > > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> > > > asm volatile(".option push\n\t\n\t" \
> > > > ".option norvc\n\t" \
> > > > ".option norelax\n\t" \
> > > > "addi sp,sp,-16\n\t" \
> > > > "sd s0,0(sp)\n\t" \
> > > > "sd ra,8(sp)\n\t" \
> > > > "addi s0,sp,16\n\t" \
> > > > "mv a4,%6\n\t" \
> > > > "mv a3,%5\n\t" \
> > > > "mv a2,%4\n\t" \
> > > > "mv a1,%3\n\t" \
> > > > "mv a0,%0\n\t" \
> > > > "call rzfive_cmo\n\t" \
> > > > "ld ra,8(sp)\n\t" \
> > > > "ld s0,0(sp)\n\t" \
> > > > "addi sp,sp,16\n\t" \
> > > > ".option pop\n\t" \
> > > > : : "r"(_cachesize), \
> > > > "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> > > > "r"((unsigned long)(_start) + (_size)), \
> > > > "r"((unsigned long) (_start)), \
> > > > "r"((unsigned long) (_size)), \
> > > > "r"((unsigned long) (_dir)), \
> > > > "r"((unsigned long) (_ops)) \
> > > > : "a0", "a1", "a2", "a3", "a4", "memory")
> > > >
> > > > Now when integrate this with ALTERNATIVE_2() as below,
> > > >
> > > > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> > > > asm volatile(ALTERNATIVE_2( \
> > > > __nops(14), \
> > > > "mv a0, %1\n\t" \
> > > > "j 2f\n\t" \
> > > > "3:\n\t" \
> > > > "cbo." __stringify(_op) " (a0)\n\t" \
> > > > "add a0, a0, %0\n\t" \
> > > > "2:\n\t" \
> > > > "bltu a0, %2, 3b\n\t" \
> > > > __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> > > > ".option push\n\t\n\t" \
> > > > ".option norvc\n\t" \
> > > > ".option norelax\n\t" \
> > > > "addi sp,sp,-16\n\t" \
> > > > "sd s0,0(sp)\n\t" \
> > > > "sd ra,8(sp)\n\t" \
> > > > "addi s0,sp,16\n\t" \
> > > > "mv a4,%6\n\t" \
> > > > "mv a3,%5\n\t" \
> > > > "mv a2,%4\n\t" \
> > > > "mv a1,%3\n\t" \
> > > > "mv a0,%0\n\t" \
> > > > "call rzfive_cmo\n\t" \
> > > > "ld ra,8(sp)\n\t" \
> > > > "ld s0,0(sp)\n\t" \
> > > > "addi sp,sp,16\n\t" \
> > > > ".option pop\n\t" \
> > > > , ANDESTECH_VENDOR_ID, \
> > > > ERRATA_ANDESTECH_NO_IOCP, CONFIG_ERRATA_RZFIVE_CMO) \
> > > > : : "r"(_cachesize), \
> > > > "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> > > > "r"((unsigned long)(_start) + (_size)), \
> > > > "r"((unsigned long) (_start)), \
> > > > "r"((unsigned long) (_size)), \
> > > > "r"((unsigned long) (_dir)), \
> > > > "r"((unsigned long) (_ops)) \
> > > > : "a0", "a1", "a2", "a3", "a4", "memory")
> > > >
> > > > I am seeing kernel panic with this change. Looking at the
> > > > riscv_alternative_fix_auipc_jalr() implementation it assumes the rest
> > > > of the alternative options are calls too. Is my understanding correct
> > > > here?
> > >
> > > The loop walks through the instructions after the location got patched and
> > > checks if an instruction is an auipc and the next one is a jalr and only then
> > > adjusts the address accordingly.
> > >
> > Ok so my understanding was wrong here.
> >
> > > So it _should_ leave all other (non auipc+jalr) instructions alone.
> > > (hopefully)
> > >
> > Agreed.
> >
> > >
> > > > Do you think this is the correct approach in my case?
> > >
> > > It does look correct on first glance.
> > >
> > \o/
> >
> > > As I had that passing thought, are you actually calling
> > > riscv_alternative_fix_auipc_jalr()
> > > from your errata/.../foo.c after doing the patching?
> > >
> > > I.e. with the current patchset, that function is only called from the
> > > cpufeature part, but for example not from the other patching locations.
> > > [and a future revision should probably change that :-) ]
> > >
> > >
> > I have made a local copy of riscv_alternative_fix_auipc_jalr() and
> > then calling it after patch_text_nosync() referring to your patch for
> > str functions.
> >
> > > After making sure that function actually runs, the next thing you could try
> > > is to have both the "original" code and the patch be identical, i.e.
> > > replace the cbo* part with your code as well and then just output the
> > > instructions via printk to check what the addresses do in both.
> > >
> > > After riscv_alternative_fix_auipc_jalr() ran then both code variants
> > > should be identical when using the same code in both areas.
> > >
> > So I have added debug prints to match the instructions as below after
> > and before patching:
> >
> > static void riscv_alternative_print_inst(unsigned int *alt_ptr,
> > unsigned int len)
> > {
> > int num_instr = len / sizeof(u32);
> > int i;
> >
> > for (i = 0; i < num_instr; i++)
> > pr_err("%s instruction: 0x%x\n", __func__, *(alt_ptr + i));
> >
> > }
> >
> > void __init_or_module andes_errata_patch_func(struct alt_entry *begin,
> > struct alt_entry *end,
> > unsigned long archid, unsigned long impid,
> > unsigned int stage)
> > {
> > ....
> > if (cpu_req_errata & tmp) {
> > pr_err("stage: %x -> %px--> %x %x %x\n", stage, alt, tmp,
> > cpu_req_errata, alt->errata_id);
> > pr_err("old:%ps alt:%ps len:%lx\n", alt->old_ptr,
> > alt->alt_ptr, alt->alt_len);
> > pr_err("Print old start\n");
> > riscv_alternative_print_inst(alt->old_ptr, alt->alt_len);
> > pr_err("Print old end\n");
> > patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> >
> > riscv_alternative_fix_auipc_jalr(alt->old_ptr, alt->alt_len,
> > alt->old_ptr - alt->alt_ptr);
> > pr_err("Print patch start\n");
> > riscv_alternative_print_inst(alt->alt_ptr, alt->alt_len);
> > pr_err("Print patch end\n");
> > }
> > .....
> > }
> >
> > Below is the log:
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] Print new old end
> > [ 0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> > [ 0.000000] Print patch start
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xff010113
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813023
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x113423
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010413
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xf0713
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x78693
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x88613
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x80593
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xe0513
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x97
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xcba080e7
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813083
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13403
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010113
> > [ 0.000000] Print patch end
> > [ 0.000000] stage: 0 -> ffffffff80a2492c--> 1 1 0
> > [ 0.000000] old:arch_sync_dma_for_device
> > alt:riscv_noncoherent_supported len:38
> > [ 0.000000] Print old start
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x970013
> > ====================> This instruction doesn't look correct it
> > should be 0x13?
> > [ 0.000000] Print old end
> > [ 0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> > [ 0.000000] Print patch start
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xff010113
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813023
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x113423
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010413
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x78713
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x78693
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x88613
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x80593
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xe0513
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x97
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xc82080e7
> > ====================> This instruction doesn't look correct comparing
> > to objdump output this should be 000080e7 or does it require the
> > offset too?
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813083
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13403
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010113
> > [ 0.000000] Print patch end
> > [ 0.000000] stage: 0 -> ffffffff80a24950--> 1 1 0
> > [ 0.000000] old:arch_sync_dma_for_cpu alt:riscv_noncoherent_supported len:38
> > [ 0.000000] Print old start
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x97
> > ====================> This instruction doesn't look correct it should
> > be 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xeee080e7
> > ====================> This instruction doesn't look correct it
> > should be 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] Print old end
> > [ 0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> > [ 0.000000] Print patch start
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xff010113
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813023
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x113423
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010413
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xf0713
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x80693
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x88613
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x78593
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xe0513
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x97
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xc4a080e7
> > ====================> This instruction doesn't look correct comparing
> > to objdump output this should be 000080e7 or does it require the
> > offset too?
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813083
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13403
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010113
> > [ 0.000000] Print patch end
> > [ 0.000000] stage: 0 -> ffffffff80a24974--> 1 1 0
> > [ 0.000000] old:arch_dma_prep_coherent alt:riscv_noncoherent_supported len:38
> > [ 0.000000] Print old start
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x970013
> > ====================> This instruction doesn't look correct it should
> > be 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x80e70000
> > ====================> This instruction doesn't look correct it should
> > be 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xe720
> > ====================> This instruction doesn't look correct it should
> > be 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13
> > [ 0.000000] Print old end
> > [ 0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> > [ 0.000000] Print patch start
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xff010113
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813023
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x113423
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010413
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xf0713
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xe8693
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x88613
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x78593
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x30513
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x97
> > [ 0.000000] riscv_alternative_print_inst instruction: 0xc12080e7
> > ====================> This instruction doesn't look correct comparing
> > to objdump output this should be 000080e7 + offset?
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x813083
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x13403
> > [ 0.000000] riscv_alternative_print_inst instruction: 0x1010113
> > [ 0.000000] Print patch end
> >
> > Here is the output from objdump of the file (dma-noncoherent.o):
> >
> > 000000000000032e <.L888^B1>:
> > 32e: ff010113 addi sp,sp,-16
> > 332: 00813023 sd s0,0(sp)
> > 336: 00113423 sd ra,8(sp)
> > 33a: 01010413 addi s0,sp,16
> > 33e: 000f0713 mv a4,t5
> > 342: 00078693 mv a3,a5
> > 346: 00088613 mv a2,a7
> > 34a: 00080593 mv a1,a6
> > 34e: 000e0513 mv a0,t3
> > 352: 00000097 auipc ra,0x0
> > 356: 000080e7 jalr ra # 352 <.L888^B1+0x24>
> > 35a: 00813083 ld ra,8(sp)
> > 35e: 00013403 ld s0,0(sp)
> > 362: 01010113 addi sp,sp,16
> >
> > 0000000000000366 <.L888^B2>:
> > 366: ff010113 addi sp,sp,-16
> > 36a: 00813023 sd s0,0(sp)
> > 36e: 00113423 sd ra,8(sp)
> > 372: 01010413 addi s0,sp,16
> > 376: 00078713 mv a4,a5
> > 37a: 00078693 mv a3,a5
> > 37e: 00088613 mv a2,a7
> > 382: 00080593 mv a1,a6
> > 386: 000e0513 mv a0,t3
> > 38a: 00000097 auipc ra,0x0
> > 38e: 000080e7 jalr ra # 38a <.L888^B2+0x24>
> > 392: 00813083 ld ra,8(sp)
> > 396: 00013403 ld s0,0(sp)
> > 39a: 01010113 addi sp,sp,16
> >
> > 000000000000039e <.L888^B3>:
> > 39e: ff010113 addi sp,sp,-16
> > 3a2: 00813023 sd s0,0(sp)
> > 3a6: 00113423 sd ra,8(sp)
> > 3aa: 01010413 addi s0,sp,16
> > 3ae: 000f0713 mv a4,t5
> > 3b2: 00080693 mv a3,a6
> > 3b6: 00088613 mv a2,a7
> > 3ba: 00078593 mv a1,a5
> > 3be: 000e0513 mv a0,t3
> > 3c2: 00000097 auipc ra,0x0
> > 3c6: 000080e7 jalr ra # 3c2 <.L888^B3+0x24>
> > 3ca: 00813083 ld ra,8(sp)
> > 3ce: 00013403 ld s0,0(sp)
> > 3d2: 01010113 addi sp,sp,16
> >
> > 00000000000003d6 <.L888^B4>:
> > 3d6: ff010113 addi sp,sp,-16
> > 3da: 00813023 sd s0,0(sp)
> > 3de: 00113423 sd ra,8(sp)
> > 3e2: 01010413 addi s0,sp,16
> > 3e6: 000f0713 mv a4,t5
> > 3ea: 000e8693 mv a3,t4
> > 3ee: 00088613 mv a2,a7
> > 3f2: 00078593 mv a1,a5
> > 3f6: 00030513 mv a0,t1
> > 3fa: 00000097 auipc ra,0x0
> > 3fe: 000080e7 jalr ra # 3fa <.L888^B4+0x24>
> > 402: 00813083 ld ra,8(sp)
> > 406: 00013403 ld s0,0(sp)
> > 40a: 01010113 addi sp,sp,16
> >
> > Disassembly of section __ksymtab_strings:
> >
> > Any pointers what could be happening?
> >
>
> Some more information,
>
> - If I drop the riscv_alternative_fix_auipc_jalr() call after
> patch_text_nosync() and then print the alt->old_ptr instructions
> before patching I can see the instructions as 0x13 (nop) which is
> correct.
>
> - if I call riscv_alternative_fix_auipc_jalr() call after
> patch_text_nosync() and then print the alt->old_ptr instructions
> before patching I dont see 0x13 (nop) consistently for old
> instructions.
which is to be expected I guess.
alt->old_ptr points to the memory location where the live kernel code
lives.
I.e. the code at this location is the thing the kernel actually runs.
The code at this location then gets overwritten by the alternative
assembly.
> - If I replace the nop's in the old instructions with my assembly code
> of rz/five cmo and then just use patch_text_nosync() I can see the
> correct actual instruction being printed apart from jalr (is some sort
> of offset added to it as I see last 4 bits match?) and then is
> replaced correctly by the same alt instructions apart from the jalr
> (log [0]).
>
> - If I replace the nop's in the old instructions with my assembly code
> of rz/five cmo and then use patch_text_nosync() and
> riscv_alternative_fix_auipc_jalr() I can see the actual old
> instructions differs a bit and again the jalr instruction differs too
> in the patched code (log [1]).
>
> [0] https://paste.debian.net/1261412/
> [1] https://paste.debian.net/1261413/
>
> Attached is the objump of dma-noncoherent.o for reference.
I did read that objdumps are not really conclusive when looking
at auipc + jalr instructions, hence the printing of the actual instructions.
As either manually or with a helper like
https://luplab.gitlab.io/rvcodecjs/#q=0xf4c080e7
you can then decode the actual instruction and compare.
In your log the two jalr instructions decode to different offsets,
jalr x1, x1, -180
vs
jalr x1, x1, -834
Can you check what the patch_offset value is in your case?
Interestingly the
auipc x1, 0
is 0 for both cases.
I'll try to build a real test-setup mimicing what you're doing
tomorrow (european tomorrow).
Heiko
More information about the linux-riscv
mailing list