[PATCH v2 2/6] RISC-V: Minimal parser for "riscv,isa" strings

Atish Patra atishp at atishpatra.org
Mon Feb 14 23:36:31 PST 2022


On Mon, Feb 14, 2022 at 7:27 PM Tsukasa OI <research_trasio at irq.a4lg.com> wrote:
>
> On 2022/02/15 5:07, Atish Patra wrote:
> > On Fri, Feb 11, 2022 at 10:25 PM Tsukasa OI
> > <research_trasio at irq.a4lg.com> wrote:
> >>
> >> Hi Atish,
> >>
> >> Your patches for the new RISC-V extension framework seem good.  I was busy
> >> working on GNU Binutils and (though I'm not completely convinced) I'm
> >> okay with Anup's "isa-ext" approach.
> >>
> >> And thanks for using my parser for new ISA extension framework.
> >>
> >> Still, I will raise an issue with your "improvement" and propose a new
> >> version of PATCH 1-3/6 (as replies to this e-mail).
> >>
> >> (I also corrected the indentation in PATCH 3/6, though this is not a
> >> functional change.)
> >>
> >> Thanks,
> >> Tsukasa
> >>
> >> On 2022/02/11 6:40, Atish Patra wrote:
> >>> From: Tsukasa OI <research_trasio at irq.a4lg.com>
> >>>
> >>> The current hart ISA ("riscv,isa") parser doesn't correctly parse:
> >>>
> >>> 1. Multi-letter extensions
> >>> 2. Version numbers
> >>>
> >>> All recently ratified ISA extensions (except 'H') have multi-letter
> >>> names. The current "riscv,isa" parser is easily confused by
> >>> multi-letter extensions and by "p" in version numbers, which can be a
> >>> huge problem for adding new extensions through the device tree.
> >>>
> >>> Leaving it as is would invite incompatible hacks and would make the
> >>> "riscv,isa" value unreliable.
> >>>
> >>> This commit implements a minimal parser for "riscv,isa" strings.  With
> >>> this, we can safely ignore multi-letter extensions and version numbers.
> >>>
> >>> Signed-off-by: Tsukasa OI <research_trasio at irq.a4lg.com>
> >>> [Improved commit text and fixed a bug around 's' in base extension]
> >>> Signed-off-by: Atish Patra <atishp at rivosinc.com>
> >>> ---
> >>>  arch/riscv/kernel/cpufeature.c | 67 ++++++++++++++++++++++++++++------
> >>>  1 file changed, 56 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> >>> index dd3d57eb4eea..e19ae4391a9b 100644
> >>> --- a/arch/riscv/kernel/cpufeature.c
> >>> +++ b/arch/riscv/kernel/cpufeature.c
> >>> @@ -7,6 +7,7 @@
> >>>   */
> >>>
> >>>  #include <linux/bitmap.h>
> >>> +#include <linux/ctype.h>
> >>>  #include <linux/of.h>
> >>>  #include <asm/processor.h>
> >>>  #include <asm/hwcap.h>
> >>> @@ -66,7 +67,7 @@ void __init riscv_fill_hwcap(void)
> >>>       struct device_node *node;
> >>>       const char *isa;
> >>>       char print_str[NUM_ALPHA_EXTS + 1];
> >>> -     size_t i, j, isa_len;
> >>> +     int i, j;
> >>>       static unsigned long isa2hwcap[256] = {0};
> >>>
> >>>       isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I;
> >>> @@ -92,23 +93,67 @@ void __init riscv_fill_hwcap(void)
> >>>                       continue;
> >>>               }
> >>>
> >>> -             i = 0;
> >>> -             isa_len = strlen(isa);
> >>>  #if IS_ENABLED(CONFIG_32BIT)
> >>>               if (!strncmp(isa, "rv32", 4))
> >>> -                     i += 4;
> >>> +                     isa += 4;
> >>>  #elif IS_ENABLED(CONFIG_64BIT)
> >>>               if (!strncmp(isa, "rv64", 4))
> >>> -                     i += 4;
> >>> +                     isa += 4;
> >>>  #endif
> >>> -             for (; i < isa_len; ++i) {
> >>> -                     this_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
> >>> +             for (; *isa; ++isa) {
> >>> +                     const char *ext = isa++;
> >>> +                     bool ext_long, ext_err = false;
> >>
> >> To address an issue reported in
> >> <https://lore.kernel.org/all/202202111616.z7nZoKYj-lkp@intel.com/T/>,
> >> we had better initialize ext_long to false.
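
For reference, in the v2 hunk the only path that leaves ext_long
unassigned appears to be the early break in the 's' case; the default
case assigns it explicitly. A reduced sketch of that control flow with
the initializer added (ext_is_long() is a hypothetical name used only for
this illustration, and callers must pass a pointer that is not the first
character of the string, because ext[-1] is read):

```c
#include <stdbool.h>

/*
 * Hypothetical reduction of the switch in the patch: reports whether the
 * extension starting at ext is skipped as a multi-letter extension.
 * Precondition (assumption of this sketch): ext[-1] is readable.
 */
bool ext_is_long(const char *ext)
{
	bool ext_long = false;	/* initialized, so every path is defined */

	switch (*ext) {
	case 's':
	case 'x':
	case 'z':
		/* v2 rule: an 's' not preceded by '_' is single-letter. */
		if (*ext == 's' && ext[-1] != '_')
			break;	/* v2 left ext_long unassigned on this path */
		ext_long = true;
		break;
	default:
		break;
	}
	return ext_long;
}
```

With the initializer in place, the explicit "ext_long = false" in the
default case becomes redundant.
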
> >>
> >>> +
> >>> +                     switch (*ext) {
> >>> +                     case 's':
> >>> +                     case 'x':
> >>> +                     case 'z':
> >>> +                             /**
> >>> +                              * 's' is a special case because:
> >>> +                              * It can be present in base extension for supervisor
> >>> +                              * Multi-letter extensions can start with 's' as well for
> >>> +                              * Supervisor extensions (i.e. sstc, sscofpmf, svinval)
> >>> +                              */
> >>> +                             if (*ext == 's' && ext[-1] != '_')
> >>> +                                     break;
> >>
> >> I think you added this as a workaround for a QEMU bug (I will describe
> >> it later), but it causes another problem.
> >>
> >> QEMU generates an invalid ISA string for "riscv,isa", such as
> >> "rv64imafdcsuh".  QEMU handles 'S' as a single-letter extension, which
> >> easily confuses my parser (it handles this as a non-existent 'Suh'
> >> extension).
> >>
> >> Atish tried to address this issue by handling \2 of "([^_])(S)"
> >> as a single-letter 'S' extension.
> >>
> >> It's okay for QEMU-generated ones and full ISA strings like
> >> "rv64imafdc_svinval_svnapot".  However, it causes problems on
> >> "rv64imafdcsvinval_svnapot" (no underscore before "svinval").  This is
> >> still a valid ISA string (with existing extensions) but confuses
> >> Atish's parser.
> >>
> >> The best solution is to fix **ONLY** QEMU and not make this exception.
> >>
> >
> > "S" is a bit ambiguous as misa defines the "S" bit as Supervisor
> > implemented but it's not a valid ISA extension.
> >
> > Probably, we should fix the ISA string generation in QEMU. But we
> > should have a workaround for backward compatibility as well, so that
> > users can run hypervisors on older QEMU versions.
>
> Okay.
>
> I sent a patch to qemu-riscv and, before sending it, I was surprised
> to find that I'm the third person to try to fix this bug.
>
> <https://lists.nongnu.org/archive/html/qemu-riscv/2019-08/msg00165.html>
> <https://lists.nongnu.org/archive/html/qemu-riscv/2021-04/msg00248.html>
> <https://lists.nongnu.org/archive/html/qemu-riscv/2022-02/msg00098.html>
>

Wow. At least I did not become the fourth person ;)
I was about to send the patch to the qemu mailing list when I saw this email.

As a side note, I usually send qemu patches to both qemu-riscv & qemu-devel
for a wider review audience. I noticed that you sent the patch only to
qemu-riscv.

>
> >
> >> If we need a QEMU workaround now, how about checking \2 of "([^_])(SU)"
> >> as single-letter 'S' and 'U' extensions?
> >>
> >
> > Sounds good to me. Can I fold in your changes and send a v3 so that it
> > is easier to merge?
>
> Of course!  That's exactly what I meant.
>
>
> >
> >> Here's the background for my new workaround:
> >>
> >> 1.  Linux requires U-mode.
> >> 2.  QEMU always generates an invalid ISA string containing "SU"
> >>     for RISC-V with MSU modes.
> >> 3.  There are no standard extensions starting with 'Su'
> >>     (there are not even any discussions or drafts about them).
> >>
> >> The following are related dumps from QEMU and Spike with my test
> >> Busybox image (with some notes):
> >>
> >> QEMU + RV64GC (with hypervisor enabled) w/o Atish's workaround
> >> =-===========================================================================-=
> >> processor       : 0
> >> hart            : 0
> >> isa             : rv64imafdcsuh
> >> isa-ext         :
> >> mmu             : sv48
> >>
> >> riscv,isa (DTS) : rv64imafdcsuh   ("suh" is handled as an extension)
> >> dmesg:
> >> [    0.000000] riscv: ISA extensions acdfim  (hypervisor not detected!)
> >> [    0.000000] riscv: ELF capabilities acdfim
> >> =-===========================================================================-=
> >>
> >> QEMU + RV64GC (with hypervisor enabled) w/ Atish's workaround
> >> =-===========================================================================-=
> >> processor       : 0
> >> hart            : 0
> >> isa             : rv64imafdcsuh
> >> isa-ext         :
> >> mmu             : sv48
> >>
> >> riscv,isa (DTS) : rv64imafdcsuh   ("suh" is handled as 'S', 'U' and 'H')
> >> dmesg:
> >> [    0.000000] riscv: ISA extensions acdfhimsu (extra 'S' and 'U' are harmless)
> >> [    0.000000] riscv: ELF capabilities acdfim
> >> =-===========================================================================-=
> >>
> >> QEMU + RV64GC (with hypervisor enabled) w/ Tsukasa's new workaround
> >> =-===========================================================================-=
> >> processor       : 0
> >> hart            : 0
> >> isa             : rv64imafdcsuh
> >> isa-ext         :
> >> mmu             : sv48
> >>
> >> riscv,isa (DTS) : rv64imafdcsuh   ("suh" is handled as 'S', 'U' and 'H')
> >> dmesg:
> >> [    0.000000] riscv: ISA extensions acdfhimsu (same as Atish's workaround)
> >> [    0.000000] riscv: ELF capabilities acdfim
> >> =-===========================================================================-=
> >>
> >> Spike + rv64imafdc_svinval_svnapot w/ Atish's workaround
> >> =-===========================================================================-=
> >> processor       : 0
> >> hart            : 0
> >> isa             : rv64imafdc
> >> isa-ext         : svinval svnapot
> >> mmu             : sv48
> >>
> >> riscv,isa (DTS) : rv64imafdc_svinval_svnapot
> >> dmesg:
> >> [    0.000000] Found ISA extension svinval
> >> [    0.000000] Found ISA extension svnapot
> >> [    0.000000] riscv: ISA extensions acdfim
> >> [    0.000000] riscv: ELF capabilities acdfim
> >> =-===========================================================================-=
> >>
> >> Spike + rv64imafdcsvinval_svnapot w/ Atish's workaround
> >> =-===========================================================================-=
> >> processor       : 0
> >> hart            : 0
> >> isa             : rv64imafdcsvinval
> >> isa-ext         : svnapot
> >> mmu             : sv48
> >>
> >> riscv,isa (DTS) : rv64imafdcsvinval_svnapot
> >> dmesg:
> >> [    0.000000] Found ISA extension svnapot        (no 'Svinval' is found!)
> >> [    0.000000] riscv: ISA extensions acdfilmnsv   ('V' et al. should not be here!)
> >> [    0.000000] riscv: ELF capabilities acdfim
> >> =-===========================================================================-=
> >>
> >> Spike + rv64imafdcsvinval_svnapot w/ Tsukasa's new workaround
> >> =-===========================================================================-=
> >> processor       : 0
> >> hart            : 0
> >> isa             : rv64imafdcsvinval
> >> isa-ext         : svinval svnapot
> >> mmu             : sv48
> >>
> >> riscv,isa (DTS) : rv64imafdcsvinval_svnapot
> >> dmesg:
> >> [    0.000000] Found ISA extension svinval
> >> [    0.000000] Found ISA extension svnapot
> >> [    0.000000] riscv: ISA extensions acdfim
> >> [    0.000000] riscv: ELF capabilities acdfim
> >> =-===========================================================================-=
> >>
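
The "ISA extensions" lines in the dumps above can be cross-checked with a
small userspace port of the loop from this patch. This is only a sketch
under stated assumptions: isa_letters(), s_is_single() and enum s_rule are
hypothetical names that do not exist in the kernel, the string must start
with "rv32" or "rv64", and the S_RULE_SU variant follows the "([^_])(SU)"
idea as described in this thread rather than any final code.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum s_rule {
	S_RULE_NONE,	/* no workaround: 's' always starts a long name */
	S_RULE_ANY,	/* v2: any 's' not preceded by '_' is single-letter */
	S_RULE_SU,	/* proposal: only "su" not preceded by '_' qualifies */
};

static bool s_is_single(const char *ext, enum s_rule rule)
{
	if (rule == S_RULE_NONE || *ext != 's' || ext[-1] == '_')
		return false;
	return rule == S_RULE_ANY || ext[1] == 'u';
}

/* Writes the detected single-letter extensions to out in a-z order. */
void isa_letters(const char *isa, enum s_rule rule, char out[27])
{
	unsigned long bits = 0;
	int n = 0;

	out[0] = '\0';
	if (strncmp(isa, "rv32", 4) && strncmp(isa, "rv64", 4))
		return;		/* this sketch requires the prefix */
	isa += 4;

	for (; *isa; ++isa) {
		const char *ext = isa++;
		bool ext_long = false, ext_err = false;

		if (s_is_single(ext, rule)) {
			/* single-letter 'S': nothing more to consume */
		} else if (*ext == 's' || *ext == 'x' || *ext == 'z') {
			/* multi-letter extension: skip to '_' or the end */
			ext_long = true;
			for (; *isa && *isa != '_'; ++isa)
				if (!islower((unsigned char)*isa) &&
				    !isdigit((unsigned char)*isa))
					ext_err = true;
		} else if (!islower((unsigned char)*ext)) {
			ext_err = true;
		} else if (isdigit((unsigned char)*isa)) {
			/* skip a version number: <major>[p<minor>] */
			while (isdigit((unsigned char)*++isa))
				;
			if (*isa == 'p' && isdigit((unsigned char)isa[1])) {
				++isa;
				while (isdigit((unsigned char)*++isa))
					;
			}
		}
		/* the for-loop increment steps over a '_' delimiter */
		if (*isa != '_')
			--isa;
		if (!ext_err && !ext_long)
			bits |= 1UL << (*ext - 'a');
	}

	for (int c = 0; c < 26; ++c)
		if (bits & (1UL << c))
			out[n++] = (char)('a' + c);
	out[n] = '\0';
}
```

Calling isa_letters("rv64imafdcsvinval_svnapot", S_RULE_ANY, out) yields
"acdfilmnsv", the stray-letter result shown in the Spike dump, while
S_RULE_SU yields "acdfim" for the same string and "acdfhimsu" for the
QEMU string "rv64imafdcsuh".
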
> >>
> >>> +                             ext_long = true;
> >>> +                             /* Multi-letter extension must be delimited */
> >>> +                             for (; *isa && *isa != '_'; ++isa)
> >>> +                                     if (!islower(*isa) && !isdigit(*isa))
> >>> +                                             ext_err = true;
> >>> +                             /* ... but must be ignored. */
> >>> +                             break;
> >>> +                     default:
> >>
> >>> +                             ext_long = false;
> >>
> >> The line above is removed in my new patches.
> >>
> >>> +                             if (!islower(*ext)) {
> >>> +                                     ext_err = true;
> >>> +                                     break;
> >>> +                             }
> >>> +                             /* Find next extension */
> >>> +                             if (!isdigit(*isa))
> >>> +                                     break;
> >>> +                             while (isdigit(*++isa))
> >>> +                                     ;
> >>> +                             if (*isa != 'p')
> >>> +                                     break;
> >>> +                             if (!isdigit(*++isa)) {
> >>> +                                     --isa;
> >>> +                                     break;
> >>> +                             }
> >>> +                             while (isdigit(*++isa))
> >>> +                                     ;
> >>> +                             break;
> >>> +                     }
> >>> +                     if (*isa != '_')
> >>> +                             --isa;
> >>>                       /*
> >>> -                      * TODO: X, Y and Z extension parsing for Host ISA
> >>> -                      * bitmap will be added in-future.
> >>> +                      * TODO: Full version-aware handling including
> >>> +                      * multi-letter extensions will be added in-future.
> >>>                        */
> >>> -                     if ('a' <= isa[i] && isa[i] < 'x')
> >>> -                             this_isa |= (1UL << (isa[i] - 'a'));
> >>> +                     if (ext_err || ext_long)
> >>> +                             continue;
> >>> +                     this_hwcap |= isa2hwcap[(unsigned char)(*ext)];
> >>> +                     this_isa |= (1UL << (*ext - 'a'));
> >>>               }
> >>>
> >>>               /*
> >
> >
> >



-- 
Regards,
Atish


