[PATCH 5/7] RISC-V: fix auipc-jalr addresses in patched alternatives

Lad, Prabhakar prabhakar.csengg at gmail.com
Mon Nov 21 07:06:02 PST 2022


Hi Heiko,

On Mon, Nov 21, 2022 at 11:27 AM Heiko Stübner <heiko at sntech.de> wrote:
>
> Hi,
>
> Am Montag, 21. November 2022, 10:50:09 CET schrieb Lad, Prabhakar:
> > On Thu, Nov 10, 2022 at 4:50 PM Heiko Stuebner <heiko at sntech.de> wrote:
> > >
> > > From: Heiko Stuebner <heiko.stuebner at vrull.eu>
> > >
> > > Alternatives live in a different section, so addresses used by call
> > > functions will point to wrong locations after the patch got applied.
> > >
> > > Similar to arm64, adjust the location to consider that offset.
> > >
> > > Signed-off-by: Heiko Stuebner <heiko.stuebner at vrull.eu>
> > > ---
>
> [...]
>
> > I have the below assembly code which I have tested without the
> > alternatives for the RZ/Five CMO,
> >
> > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops)        \
> > asm volatile(".option push\n\t\n\t"                    \
> >          ".option norvc\n\t"                    \
> >          ".option norelax\n\t"                    \
> >          "addi sp,sp,-16\n\t"                    \
> >          "sd    s0,0(sp)\n\t"                    \
> >          "sd    ra,8(sp)\n\t"                    \
> >          "addi    s0,sp,16\n\t"                    \
> >          "mv a4,%6\n\t"                        \
> >          "mv a3,%5\n\t"                        \
> >          "mv a2,%4\n\t"                        \
> >          "mv a1,%3\n\t"                        \
> >          "mv a0,%0\n\t"                        \
> >          "call rzfive_cmo\n\t"                    \
> >          "ld    ra,8(sp)\n\t"                    \
> >          "ld    s0,0(sp)\n\t"                    \
> >          "addi    sp,sp,16\n\t"                    \
> >          ".option pop\n\t"                        \
> >          : : "r"(_cachesize),                    \
> >          "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),    \
> >          "r"((unsigned long)(_start) + (_size)),            \
> >          "r"((unsigned long) (_start)),                \
> >          "r"((unsigned long) (_size)),                \
> >          "r"((unsigned long) (_dir)),                \
> >          "r"((unsigned long) (_ops))                \
> >          : "a0", "a1", "a2", "a3", "a4", "memory")
> >
> > Now when integrate this with ALTERNATIVE_2() as below,
> >
> > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops)        \
> > asm volatile(ALTERNATIVE_2(                        \
> >     __nops(14),                            \
> >     "mv a0, %1\n\t"                            \
> >     "j 2f\n\t"                            \
> >     "3:\n\t"                            \
> >     "cbo." __stringify(_op) " (a0)\n\t"                \
> >     "add a0, a0, %0\n\t"                        \
> >     "2:\n\t"                            \
> >     "bltu a0, %2, 3b\n\t"                        \
> >     __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM,    \
> >     ".option push\n\t\n\t"                        \
> >     ".option norvc\n\t"                        \
> >     ".option norelax\n\t"                        \
> >     "addi sp,sp,-16\n\t"                        \
> >     "sd    s0,0(sp)\n\t"                        \
> >     "sd    ra,8(sp)\n\t"                        \
> >     "addi    s0,sp,16\n\t"                        \
> >     "mv a4,%6\n\t"                            \
> >     "mv a3,%5\n\t"                            \
> >     "mv a2,%4\n\t"                            \
> >     "mv a1,%3\n\t"                            \
> >     "mv a0,%0\n\t"                            \
> >     "call rzfive_cmo\n\t"                \
> >     "ld    ra,8(sp)\n\t"                        \
> >     "ld    s0,0(sp)\n\t"                        \
> >     "addi    sp,sp,16\n\t"                        \
> >     ".option pop\n\t"                        \
> >     , ANDESTECH_VENDOR_ID,                        \
> >             ERRATA_ANDESTECH_NO_IOCP, CONFIG_ERRATA_RZFIVE_CMO)    \
> >     : : "r"(_cachesize),                        \
> >     "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),    \
> >     "r"((unsigned long)(_start) + (_size)),            \
> >     "r"((unsigned long) (_start)),                \
> >     "r"((unsigned long) (_size)),                \
> >     "r"((unsigned long) (_dir)),                \
> >     "r"((unsigned long) (_ops))                \
> >     : "a0", "a1", "a2", "a3", "a4", "memory")
> >
> > I am seeing kernel panic with this change. Looking at the
> > riscv_alternative_fix_auipc_jalr() implementation it assumes the rest
> > of the alternative options are calls too. Is my understanding correct
> > here?
>
> The loop walks through the instructions after the location got patched and
> checks if an instruction is an auipc and the next one is a jalr and only then
> adjusts the address accordingly.
>
Ok so my understanding was wrong here.

> So it _should_ leave all other (non auipc+jalr) instructions alone.
> (hopefully)
>
Agreed.

>
> > Do you think this is the correct approach in my case?
>
> It does look correct on first glance.
>
\o/

> As I had that passing thought, are you actually calling
>         riscv_alternative_fix_auipc_jalr()
> from your errata/.../foo.c after doing the patching?
>
> I.e. with the current patchset, that function is only called from the
> cpufeature part, but for example not from the other patching locations.
> [and a future revision should probably change that :-) ]
>
>
I have made a local copy of riscv_alternative_fix_auipc_jalr() and
then calling it after patch_text_nosync() referring to your patch for
str functions.

> After making sure that function actually runs, the next thing you could try
> is to have both the "original" code and the patch be identical, i.e.
> replace the cbo* part with your code as well and then just output the
> instructions via printk to check what the addresses do in both.
>
> After riscv_alternative_fix_auipc_jalr() ran then both code variants
> should be identical when using the same code in both areas.
>
So I have added debug prints to match the instructions as below after
and before patching:

static void riscv_alternative_print_inst(unsigned int *alt_ptr,
                     unsigned int len)
{
    int num_instr = len / sizeof(u32);
    int i;

    for (i = 0; i < num_instr; i++)
        pr_err("%s instruction: 0x%x\n", __func__, *(alt_ptr + i));

}

void __init_or_module andes_errata_patch_func(struct alt_entry *begin,
struct alt_entry *end,
                          unsigned long archid, unsigned long impid,
                          unsigned int stage)
{
....
    if (cpu_req_errata & tmp) {
        pr_err("stage: %x -> %px--> %x %x %x\n", stage, alt, tmp,
cpu_req_errata, alt->errata_id);
        pr_err("old:%ps alt:%ps len:%lx\n", alt->old_ptr,
alt->alt_ptr, alt->alt_len);
        pr_err("Print old start\n");
        riscv_alternative_print_inst(alt->old_ptr, alt->alt_len);
        pr_err("Print old end\n");
        patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);

        riscv_alternative_fix_auipc_jalr(alt->old_ptr, alt->alt_len,
                        alt->old_ptr - alt->alt_ptr);
        pr_err("Print patch start\n");
        riscv_alternative_print_inst(alt->alt_ptr, alt->alt_len);
        pr_err("Print patch end\n");
    }
.....
}

Below is the log:
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] Print new old end
[    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
[    0.000000] Print patch start
[    0.000000] riscv_alternative_print_inst instruction: 0xff010113
[    0.000000] riscv_alternative_print_inst instruction: 0x813023
[    0.000000] riscv_alternative_print_inst instruction: 0x113423
[    0.000000] riscv_alternative_print_inst instruction: 0x1010413
[    0.000000] riscv_alternative_print_inst instruction: 0xf0713
[    0.000000] riscv_alternative_print_inst instruction: 0x78693
[    0.000000] riscv_alternative_print_inst instruction: 0x88613
[    0.000000] riscv_alternative_print_inst instruction: 0x80593
[    0.000000] riscv_alternative_print_inst instruction: 0xe0513
[    0.000000] riscv_alternative_print_inst instruction: 0x97
[    0.000000] riscv_alternative_print_inst instruction: 0xcba080e7
[    0.000000] riscv_alternative_print_inst instruction: 0x813083
[    0.000000] riscv_alternative_print_inst instruction: 0x13403
[    0.000000] riscv_alternative_print_inst instruction: 0x1010113
[    0.000000] Print patch end
[    0.000000] stage: 0 -> ffffffff80a2492c--> 1 1 0
[    0.000000] old:arch_sync_dma_for_device
alt:riscv_noncoherent_supported len:38
[    0.000000] Print  old start
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x970013
    ====================> This instruction doesn't look correct it
should be 0x13?
[    0.000000] Print  old end
[    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
[    0.000000] Print patch start
[    0.000000] riscv_alternative_print_inst instruction: 0xff010113
[    0.000000] riscv_alternative_print_inst instruction: 0x813023
[    0.000000] riscv_alternative_print_inst instruction: 0x113423
[    0.000000] riscv_alternative_print_inst instruction: 0x1010413
[    0.000000] riscv_alternative_print_inst instruction: 0x78713
[    0.000000] riscv_alternative_print_inst instruction: 0x78693
[    0.000000] riscv_alternative_print_inst instruction: 0x88613
[    0.000000] riscv_alternative_print_inst instruction: 0x80593
[    0.000000] riscv_alternative_print_inst instruction: 0xe0513
[    0.000000] riscv_alternative_print_inst instruction: 0x97
[    0.000000] riscv_alternative_print_inst instruction: 0xc82080e7
====================> This instruction doesn't look correct comparing
to objdump output this should be 000080e7 or does it require the
offset too?
[    0.000000] riscv_alternative_print_inst instruction: 0x813083
[    0.000000] riscv_alternative_print_inst instruction: 0x13403
[    0.000000] riscv_alternative_print_inst instruction: 0x1010113
[    0.000000] Print patch end
[    0.000000] stage: 0 -> ffffffff80a24950--> 1 1 0
[    0.000000] old:arch_sync_dma_for_cpu alt:riscv_noncoherent_supported len:38
[    0.000000] Print old start
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x97
====================> This instruction doesn't look correct it should
be 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0xeee080e7
      ====================> This instruction doesn't look correct it
should be 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] Print old end
[    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
[    0.000000] Print patch start
[    0.000000] riscv_alternative_print_inst instruction: 0xff010113
[    0.000000] riscv_alternative_print_inst instruction: 0x813023
[    0.000000] riscv_alternative_print_inst instruction: 0x113423
[    0.000000] riscv_alternative_print_inst instruction: 0x1010413
[    0.000000] riscv_alternative_print_inst instruction: 0xf0713
[    0.000000] riscv_alternative_print_inst instruction: 0x80693
[    0.000000] riscv_alternative_print_inst instruction: 0x88613
[    0.000000] riscv_alternative_print_inst instruction: 0x78593
[    0.000000] riscv_alternative_print_inst instruction: 0xe0513
[    0.000000] riscv_alternative_print_inst instruction: 0x97
[    0.000000] riscv_alternative_print_inst instruction: 0xc4a080e7
====================> This instruction doesn't look correct comparing
to objdump output this should be 000080e7 or does it require the
offset too?
[    0.000000] riscv_alternative_print_inst instruction: 0x813083
[    0.000000] riscv_alternative_print_inst instruction: 0x13403
[    0.000000] riscv_alternative_print_inst instruction: 0x1010113
[    0.000000] Print patch end
[    0.000000] stage: 0 -> ffffffff80a24974--> 1 1 0
[    0.000000] old:arch_dma_prep_coherent alt:riscv_noncoherent_supported len:38
[    0.000000] Print old start
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x970013
====================> This instruction doesn't look correct it should
be 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x80e70000
====================> This instruction doesn't look correct it should
be 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0xe720
====================> This instruction doesn't look correct it should
be 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] riscv_alternative_print_inst instruction: 0x13
[    0.000000] Print old end
[    0.000000] riscv_alternative_fix_auipc_jalr num instruction: 14
[    0.000000] Print patch start
[    0.000000] riscv_alternative_print_inst instruction: 0xff010113
[    0.000000] riscv_alternative_print_inst instruction: 0x813023
[    0.000000] riscv_alternative_print_inst instruction: 0x113423
[    0.000000] riscv_alternative_print_inst instruction: 0x1010413
[    0.000000] riscv_alternative_print_inst instruction: 0xf0713
[    0.000000] riscv_alternative_print_inst instruction: 0xe8693
[    0.000000] riscv_alternative_print_inst instruction: 0x88613
[    0.000000] riscv_alternative_print_inst instruction: 0x78593
[    0.000000] riscv_alternative_print_inst instruction: 0x30513
[    0.000000] riscv_alternative_print_inst instruction: 0x97
[    0.000000] riscv_alternative_print_inst instruction: 0xc12080e7
====================> This instruction doesn't look correct comparing
to objdump output this should be 000080e7 + offset?
[    0.000000] riscv_alternative_print_inst instruction: 0x813083
[    0.000000] riscv_alternative_print_inst instruction: 0x13403
[    0.000000] riscv_alternative_print_inst instruction: 0x1010113
[    0.000000] Print patch end

Here is the output from objdump of the file (dma-noncoherent.o):

000000000000032e <.L888^B1>:
 32e:    ff010113              addi    sp,sp,-16
 332:    00813023              sd    s0,0(sp)
 336:    00113423              sd    ra,8(sp)
 33a:    01010413              addi    s0,sp,16
 33e:    000f0713              mv    a4,t5
 342:    00078693              mv    a3,a5
 346:    00088613              mv    a2,a7
 34a:    00080593              mv    a1,a6
 34e:    000e0513              mv    a0,t3
 352:    00000097              auipc    ra,0x0
 356:    000080e7              jalr    ra # 352 <.L888^B1+0x24>
 35a:    00813083              ld    ra,8(sp)
 35e:    00013403              ld    s0,0(sp)
 362:    01010113              addi    sp,sp,16

0000000000000366 <.L888^B2>:
 366:    ff010113              addi    sp,sp,-16
 36a:    00813023              sd    s0,0(sp)
 36e:    00113423              sd    ra,8(sp)
 372:    01010413              addi    s0,sp,16
 376:    00078713              mv    a4,a5
 37a:    00078693              mv    a3,a5
 37e:    00088613              mv    a2,a7
 382:    00080593              mv    a1,a6
 386:    000e0513              mv    a0,t3
 38a:    00000097              auipc    ra,0x0
 38e:    000080e7              jalr    ra # 38a <.L888^B2+0x24>
 392:    00813083              ld    ra,8(sp)
 396:    00013403              ld    s0,0(sp)
 39a:    01010113              addi    sp,sp,16

000000000000039e <.L888^B3>:
 39e:    ff010113              addi    sp,sp,-16
 3a2:    00813023              sd    s0,0(sp)
 3a6:    00113423              sd    ra,8(sp)
 3aa:    01010413              addi    s0,sp,16
 3ae:    000f0713              mv    a4,t5
 3b2:    00080693              mv    a3,a6
 3b6:    00088613              mv    a2,a7
 3ba:    00078593              mv    a1,a5
 3be:    000e0513              mv    a0,t3
 3c2:    00000097              auipc    ra,0x0
 3c6:    000080e7              jalr    ra # 3c2 <.L888^B3+0x24>
 3ca:    00813083              ld    ra,8(sp)
 3ce:    00013403              ld    s0,0(sp)
 3d2:    01010113              addi    sp,sp,16

00000000000003d6 <.L888^B4>:
 3d6:    ff010113              addi    sp,sp,-16
 3da:    00813023              sd    s0,0(sp)
 3de:    00113423              sd    ra,8(sp)
 3e2:    01010413              addi    s0,sp,16
 3e6:    000f0713              mv    a4,t5
 3ea:    000e8693              mv    a3,t4
 3ee:    00088613              mv    a2,a7
 3f2:    00078593              mv    a1,a5
 3f6:    00030513              mv    a0,t1
 3fa:    00000097              auipc    ra,0x0
 3fe:    000080e7              jalr    ra # 3fa <.L888^B4+0x24>
 402:    00813083              ld    ra,8(sp)
 406:    00013403              ld    s0,0(sp)
 40a:    01010113              addi    sp,sp,16

Disassembly of section __ksymtab_strings:

Any pointers what could be happening?

>
> > Note, I wanted to test with ALTERNATIVE_2() first to make sure
> > everything is okay and then later test my ALTERNATIVE_3()
> > implementation.
>
> sounds like a very sensible idea to use the existing macros
> first for verification :-)
>
:)

Cheers,
Prabhakar



More information about the linux-riscv mailing list