[PATCH 5/7] RISC-V: fix auipc-jalr addresses in patched alternatives

Lad, Prabhakar prabhakar.csengg at gmail.com
Mon Nov 21 13:31:36 PST 2022


Hi Heiko,

On Mon, Nov 21, 2022 at 3:06 PM Lad, Prabhakar
<prabhakar.csengg at gmail.com> wrote:
>
> Hi Heiko,
>
> On Mon, Nov 21, 2022 at 11:27 AM Heiko Stübner <heiko at sntech.de> wrote:
> >
> > Hi,
> >
> > Am Montag, 21. November 2022, 10:50:09 CET schrieb Lad, Prabhakar:
> > > On Thu, Nov 10, 2022 at 4:50 PM Heiko Stuebner <heiko at sntech.de> wrote:
> > > >
> > > > From: Heiko Stuebner <heiko.stuebner at vrull.eu>
> > > >
> > > > Alternatives live in a different section, so addresses used by call
> > > > functions will point to wrong locations after the patch got applied.
> > > >
> > > > Similar to arm64, adjust the location to consider that offset.
> > > >
> > > > Signed-off-by: Heiko Stuebner <heiko.stuebner at vrull.eu>
> > > > ---
> >
> > [...]
> >
> > > I have the below assembly code which I have tested without the
> > > alternatives for the RZ/Five CMO,
> > >
> > > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops)        \
> > > asm volatile(".option push\n\t\n\t"                    \
> > >          ".option norvc\n\t"                    \
> > >          ".option norelax\n\t"                    \
> > >          "addi sp,sp,-16\n\t"                    \
> > >          "sd    s0,0(sp)\n\t"                    \
> > >          "sd    ra,8(sp)\n\t"                    \
> > >          "addi    s0,sp,16\n\t"                    \
> > >          "mv a4,%6\n\t"                        \
> > >          "mv a3,%5\n\t"                        \
> > >          "mv a2,%4\n\t"                        \
> > >          "mv a1,%3\n\t"                        \
> > >          "mv a0,%0\n\t"                        \
> > >          "call rzfive_cmo\n\t"                    \
> > >          "ld    ra,8(sp)\n\t"                    \
> > >          "ld    s0,0(sp)\n\t"                    \
> > >          "addi    sp,sp,16\n\t"                    \
> > >          ".option pop\n\t"                        \
> > >          : : "r"(_cachesize),                    \
> > >          "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),    \
> > >          "r"((unsigned long)(_start) + (_size)),            \
> > >          "r"((unsigned long) (_start)),                \
> > >          "r"((unsigned long) (_size)),                \
> > >          "r"((unsigned long) (_dir)),                \
> > >          "r"((unsigned long) (_ops))                \
> > >          : "a0", "a1", "a2", "a3", "a4", "memory")
> > >
> > > Now when integrate this with ALTERNATIVE_2() as below,
> > >
> > > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops)        \
> > > asm volatile(ALTERNATIVE_2(                        \
> > >     __nops(14),                            \
> > >     "mv a0, %1\n\t"                            \
> > >     "j 2f\n\t"                            \
> > >     "3:\n\t"                            \
> > >     "cbo." __stringify(_op) " (a0)\n\t"                \
> > >     "add a0, a0, %0\n\t"                        \
> > >     "2:\n\t"                            \
> > >     "bltu a0, %2, 3b\n\t"                        \
> > >     __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM,    \
> > >     ".option push\n\t\n\t"                        \
> > >     ".option norvc\n\t"                        \
> > >     ".option norelax\n\t"                        \
> > >     "addi sp,sp,-16\n\t"                        \
> > >     "sd    s0,0(sp)\n\t"                        \
> > >     "sd    ra,8(sp)\n\t"                        \
> > >     "addi    s0,sp,16\n\t"                        \
> > >     "mv a4,%6\n\t"                            \
> > >     "mv a3,%5\n\t"                            \
> > >     "mv a2,%4\n\t"                            \
> > >     "mv a1,%3\n\t"                            \
> > >     "mv a0,%0\n\t"                            \
> > >     "call rzfive_cmo\n\t"                \
> > >     "ld    ra,8(sp)\n\t"                        \
> > >     "ld    s0,0(sp)\n\t"                        \
> > >     "addi    sp,sp,16\n\t"                        \
> > >     ".option pop\n\t"                        \
> > >     , ANDESTECH_VENDOR_ID,                        \
> > >             ERRATA_ANDESTECH_NO_IOCP, CONFIG_ERRATA_RZFIVE_CMO)    \
> > >     : : "r"(_cachesize),                        \
> > >     "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),    \
> > >     "r"((unsigned long)(_start) + (_size)),            \
> > >     "r"((unsigned long) (_start)),                \
> > >     "r"((unsigned long) (_size)),                \
> > >     "r"((unsigned long) (_dir)),                \
> > >     "r"((unsigned long) (_ops))                \
> > >     : "a0", "a1", "a2", "a3", "a4", "memory")
> > >
> > > I am seeing kernel panic with this change. Looking at the
> > > riscv_alternative_fix_auipc_jalr() implementation it assumes the rest
> > > of the alternative options are calls too. Is my understanding correct
> > > here?
> >
> > The loop walks through the instructions after the location got patched and
> > checks if an instruction is an auipc and the next one is a jalr and only then
> > adjusts the address accordingly.
> >
> Ok so my understanding was wrong here.
>
> > So it _should_ leave all other (non auipc+jalr) instructions alone.
> > (hopefully)
> >
> Agreed.
>
> >
> > > Do you think this is the correct approach in my case?
> >
> > It does look correct on first glance.
> >
> \o/
>
> > As I had that passing thought, are you actually calling
> >         riscv_alternative_fix_auipc_jalr()
> > from your errata/.../foo.c after doing the patching?
> >
> > I.e. with the current patchset, that function is only called from the
> > cpufeature part, but for example not from the other patching locations.
> > [and a future revision should probably change that :-) ]
> >
> >
> I have made a local copy of riscv_alternative_fix_auipc_jalr() and
> then calling it after patch_text_nosync() referring to your patch for
> str functions.
>
> > After making sure that function actually runs, the next thing you could try
> > is to have both the "original" code and the patch be identical, i.e.
> > replace the cbo* part with your code as well and then just output the
> > instructions via printk to check what the addresses do in both.
> >
> > After riscv_alternative_fix_auipc_jalr() ran then both code variants
> > should be identical when using the same code in both areas.
> >
> So I have added debug prints to match the instructions as below after
> and before patching:
>
> static void riscv_alternative_print_inst(unsigned int *alt_ptr,
>                      unsigned int len)
> {
>     int num_instr = len / sizeof(u32);
>     int i;
>
>     for (i = 0; i < num_instr; i++)
>         pr_err("%s instruction: 0x%x\n", __func__, *(alt_ptr + i));
>
> }
>
> void __init_or_module andes_errata_patch_func(struct alt_entry *begin,
> struct alt_entry *end,
>                           unsigned long archid, unsigned long impid,
>                           unsigned int stage)
> {
> ....
>     if (cpu_req_errata & tmp) {
>         pr_err("stage: %x -> %px--> %x %x %x\n", stage, alt, tmp,
> cpu_req_errata, alt->errata_id);
>         pr_err("old:%ps alt:%ps len:%lx\n", alt->old_ptr,
> alt->alt_ptr, alt->alt_len);
>         pr_err("Print old start\n");
>         riscv_alternative_print_inst(alt->old_ptr, alt->alt_len);
>         pr_err("Print old end\n");
>         patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
>
>         riscv_alternative_fix_auipc_jalr(alt->old_ptr, alt->alt_len,
>                         alt->old_ptr - alt->alt_ptr);
>         pr_err("Print patch start\n");
>         riscv_alternative_print_inst(alt->alt_ptr, alt->alt_len);
>         pr_err("Print patch end\n");
>     }
> .....
> }
>
> Below is the log:
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] Print new old end
> [    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> [    0.000000] Print patch start
> [    0.000000] riscv_alternative_print_inst instruction: 0xff010113
> [    0.000000] riscv_alternative_print_inst instruction: 0x813023
> [    0.000000] riscv_alternative_print_inst instruction: 0x113423
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010413
> [    0.000000] riscv_alternative_print_inst instruction: 0xf0713
> [    0.000000] riscv_alternative_print_inst instruction: 0x78693
> [    0.000000] riscv_alternative_print_inst instruction: 0x88613
> [    0.000000] riscv_alternative_print_inst instruction: 0x80593
> [    0.000000] riscv_alternative_print_inst instruction: 0xe0513
> [    0.000000] riscv_alternative_print_inst instruction: 0x97
> [    0.000000] riscv_alternative_print_inst instruction: 0xcba080e7
> [    0.000000] riscv_alternative_print_inst instruction: 0x813083
> [    0.000000] riscv_alternative_print_inst instruction: 0x13403
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010113
> [    0.000000] Print patch end
> [    0.000000] stage: 0 -> ffffffff80a2492c--> 1 1 0
> [    0.000000] old:arch_sync_dma_for_device
> alt:riscv_noncoherent_supported len:38
> [    0.000000] Print  old start
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x970013
>     ====================> This instruction doesn't look correct it
> should be 0x13?
> [    0.000000] Print  old end
> [    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> [    0.000000] Print patch start
> [    0.000000] riscv_alternative_print_inst instruction: 0xff010113
> [    0.000000] riscv_alternative_print_inst instruction: 0x813023
> [    0.000000] riscv_alternative_print_inst instruction: 0x113423
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010413
> [    0.000000] riscv_alternative_print_inst instruction: 0x78713
> [    0.000000] riscv_alternative_print_inst instruction: 0x78693
> [    0.000000] riscv_alternative_print_inst instruction: 0x88613
> [    0.000000] riscv_alternative_print_inst instruction: 0x80593
> [    0.000000] riscv_alternative_print_inst instruction: 0xe0513
> [    0.000000] riscv_alternative_print_inst instruction: 0x97
> [    0.000000] riscv_alternative_print_inst instruction: 0xc82080e7
> ====================> This instruction doesn't look correct comparing
> to objdump output this should be 000080e7 or does it require the
> offset too?
> [    0.000000] riscv_alternative_print_inst instruction: 0x813083
> [    0.000000] riscv_alternative_print_inst instruction: 0x13403
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010113
> [    0.000000] Print patch end
> [    0.000000] stage: 0 -> ffffffff80a24950--> 1 1 0
> [    0.000000] old:arch_sync_dma_for_cpu alt:riscv_noncoherent_supported len:38
> [    0.000000] Print old start
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x97
> ====================> This instruction doesn't look correct it should
> be 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0xeee080e7
>       ====================> This instruction doesn't look correct it
> should be 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] Print old end
> [    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> [    0.000000] Print patch start
> [    0.000000] riscv_alternative_print_inst instruction: 0xff010113
> [    0.000000] riscv_alternative_print_inst instruction: 0x813023
> [    0.000000] riscv_alternative_print_inst instruction: 0x113423
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010413
> [    0.000000] riscv_alternative_print_inst instruction: 0xf0713
> [    0.000000] riscv_alternative_print_inst instruction: 0x80693
> [    0.000000] riscv_alternative_print_inst instruction: 0x88613
> [    0.000000] riscv_alternative_print_inst instruction: 0x78593
> [    0.000000] riscv_alternative_print_inst instruction: 0xe0513
> [    0.000000] riscv_alternative_print_inst instruction: 0x97
> [    0.000000] riscv_alternative_print_inst instruction: 0xc4a080e7
> ====================> This instruction doesn't look correct comparing
> to objdump output this should be 000080e7 or does it require the
> offset too?
> [    0.000000] riscv_alternative_print_inst instruction: 0x813083
> [    0.000000] riscv_alternative_print_inst instruction: 0x13403
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010113
> [    0.000000] Print patch end
> [    0.000000] stage: 0 -> ffffffff80a24974--> 1 1 0
> [    0.000000] old:arch_dma_prep_coherent alt:riscv_noncoherent_supported len:38
> [    0.000000] Print old start
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x970013
> ====================> This instruction doesn't look correct it should
> be 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x80e70000
> ====================> This instruction doesn't look correct it should
> be 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0xe720
> ====================> This instruction doesn't look correct it should
> be 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] riscv_alternative_print_inst instruction: 0x13
> [    0.000000] Print old end
> [    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
> [    0.000000] Print patch start
> [    0.000000] riscv_alternative_print_inst instruction: 0xff010113
> [    0.000000] riscv_alternative_print_inst instruction: 0x813023
> [    0.000000] riscv_alternative_print_inst instruction: 0x113423
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010413
> [    0.000000] riscv_alternative_print_inst instruction: 0xf0713
> [    0.000000] riscv_alternative_print_inst instruction: 0xe8693
> [    0.000000] riscv_alternative_print_inst instruction: 0x88613
> [    0.000000] riscv_alternative_print_inst instruction: 0x78593
> [    0.000000] riscv_alternative_print_inst instruction: 0x30513
> [    0.000000] riscv_alternative_print_inst instruction: 0x97
> [    0.000000] riscv_alternative_print_inst instruction: 0xc12080e7
> ====================> This instruction doesn't look correct comparing
> to objdump output this should be 000080e7 + offset?
> [    0.000000] riscv_alternative_print_inst instruction: 0x813083
> [    0.000000] riscv_alternative_print_inst instruction: 0x13403
> [    0.000000] riscv_alternative_print_inst instruction: 0x1010113
> [    0.000000] Print patch end
>
> Here is the output from objdump of the file (dma-noncoherent.o):
>
> 000000000000032e <.L888^B1>:
>  32e:    ff010113              addi    sp,sp,-16
>  332:    00813023              sd    s0,0(sp)
>  336:    00113423              sd    ra,8(sp)
>  33a:    01010413              addi    s0,sp,16
>  33e:    000f0713              mv    a4,t5
>  342:    00078693              mv    a3,a5
>  346:    00088613              mv    a2,a7
>  34a:    00080593              mv    a1,a6
>  34e:    000e0513              mv    a0,t3
>  352:    00000097              auipc    ra,0x0
>  356:    000080e7              jalr    ra # 352 <.L888^B1+0x24>
>  35a:    00813083              ld    ra,8(sp)
>  35e:    00013403              ld    s0,0(sp)
>  362:    01010113              addi    sp,sp,16
>
> 0000000000000366 <.L888^B2>:
>  366:    ff010113              addi    sp,sp,-16
>  36a:    00813023              sd    s0,0(sp)
>  36e:    00113423              sd    ra,8(sp)
>  372:    01010413              addi    s0,sp,16
>  376:    00078713              mv    a4,a5
>  37a:    00078693              mv    a3,a5
>  37e:    00088613              mv    a2,a7
>  382:    00080593              mv    a1,a6
>  386:    000e0513              mv    a0,t3
>  38a:    00000097              auipc    ra,0x0
>  38e:    000080e7              jalr    ra # 38a <.L888^B2+0x24>
>  392:    00813083              ld    ra,8(sp)
>  396:    00013403              ld    s0,0(sp)
>  39a:    01010113              addi    sp,sp,16
>
> 000000000000039e <.L888^B3>:
>  39e:    ff010113              addi    sp,sp,-16
>  3a2:    00813023              sd    s0,0(sp)
>  3a6:    00113423              sd    ra,8(sp)
>  3aa:    01010413              addi    s0,sp,16
>  3ae:    000f0713              mv    a4,t5
>  3b2:    00080693              mv    a3,a6
>  3b6:    00088613              mv    a2,a7
>  3ba:    00078593              mv    a1,a5
>  3be:    000e0513              mv    a0,t3
>  3c2:    00000097              auipc    ra,0x0
>  3c6:    000080e7              jalr    ra # 3c2 <.L888^B3+0x24>
>  3ca:    00813083              ld    ra,8(sp)
>  3ce:    00013403              ld    s0,0(sp)
>  3d2:    01010113              addi    sp,sp,16
>
> 00000000000003d6 <.L888^B4>:
>  3d6:    ff010113              addi    sp,sp,-16
>  3da:    00813023              sd    s0,0(sp)
>  3de:    00113423              sd    ra,8(sp)
>  3e2:    01010413              addi    s0,sp,16
>  3e6:    000f0713              mv    a4,t5
>  3ea:    000e8693              mv    a3,t4
>  3ee:    00088613              mv    a2,a7
>  3f2:    00078593              mv    a1,a5
>  3f6:    00030513              mv    a0,t1
>  3fa:    00000097              auipc    ra,0x0
>  3fe:    000080e7              jalr    ra # 3fa <.L888^B4+0x24>
>  402:    00813083              ld    ra,8(sp)
>  406:    00013403              ld    s0,0(sp)
>  40a:    01010113              addi    sp,sp,16
>
> Disassembly of section __ksymtab_strings:
>
> Any pointers what could be happening?
>

Some more information,

- If I drop the riscv_alternative_fix_auipc_jalr() call after
patch_text_nosync() and then print the alt->old_ptr instructions
before patching I can see the instructions as 0x13 (nop) which is
correct.

- if I call riscv_alternative_fix_auipc_jalr() call after
patch_text_nosync() and then print the alt->old_ptr instructions
before patching I dont see 0x13 (nop) consistently for old
instructions.

- If I replace the nop's in the old instructions with my assembly code
of rz/five cmo and then just use patch_text_nosync() I can see the
correct actual instruction being printed apart from jalr (is some sort
of offset added to it as I see last 4 bits match?) and then is
replaced correctly by the same alt instructions apart from the jalr
(log [0]).

- If I replace the nop's in the old instructions with my assembly code
of rz/five cmo and then use patch_text_nosync() and
riscv_alternative_fix_auipc_jalr() I can see the actual old
instructions differs a bit and again the jalr instruction differs too
in the patched code (log [1]).

[0] https://paste.debian.net/1261412/
[1] https://paste.debian.net/1261413/

Attached is the objump of dma-noncoherent.o for reference.

Note, if I replace the old/orignal instruction to my asm code for
rz/five cmo and replace the errata id's to deadbeef the code works OK.

Cheers,
Prabhakar
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dma-noncoherent.objdump
Type: application/octet-stream
Size: 23236 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20221121/6aad9187/attachment.obj>


More information about the linux-riscv mailing list