[PATCH v2 2/6] RISC-V: Minimal parser for "riscv, isa" strings

Tsukasa OI research_trasio at irq.a4lg.com
Fri Feb 11 22:25:48 PST 2022


Hi Atish,

Your patches for the new RISC-V extension framework seem good.  I was busy
working on GNU Binutils, and (though I'm not completely convinced) I'm
okay with Anup's "isa-ext" approach.

And thanks for using my parser for the new ISA extension framework.

Still, I will raise an issue with your "improvement" and propose a new
version of PATCH 1-3/6 (as replies to this e-mail).

(I also corrected the indentation in PATCH 3/6, though this is not a
functional change.)

Thanks,
Tsukasa

On 2022/02/11 6:40, Atish Patra wrote:
> From: Tsukasa OI <research_trasio at irq.a4lg.com>
> 
> Current hart ISA ("riscv,isa") parser don't correctly parse:
> 
> 1. Multi-letter extensions
> 2. Version numbers
> 
> All ISA extensions ratified recently has multi-letter extensions
> (except 'H'). The current "riscv,isa" parser that is easily confused
> by multi-letter extensions and "p" in version numbers can be a huge
> problem for adding new extensions through the device tree.
> 
> Leaving it would create incompatible hacks and would make "riscv,isa"
> value unreliable.
> 
> This commit implements minimal parser for "riscv,isa" strings.  With this,
> we can safely ignore multi-letter extensions and version numbers.
> 
> Signed-off-by: Tsukasa OI <research_trasio at irq.a4lg.com>
> [Improved commit text and fixed a bug around 's' in base extension]
> Signed-off-by: Atish Patra <atishp at rivosinc.com>
> ---
>  arch/riscv/kernel/cpufeature.c | 67 ++++++++++++++++++++++++++++------
>  1 file changed, 56 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index dd3d57eb4eea..e19ae4391a9b 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include <linux/bitmap.h>
> +#include <linux/ctype.h>
>  #include <linux/of.h>
>  #include <asm/processor.h>
>  #include <asm/hwcap.h>
> @@ -66,7 +67,7 @@ void __init riscv_fill_hwcap(void)
>  	struct device_node *node;
>  	const char *isa;
>  	char print_str[NUM_ALPHA_EXTS + 1];
> -	size_t i, j, isa_len;
> +	int i, j;
>  	static unsigned long isa2hwcap[256] = {0};
>  
>  	isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I;
> @@ -92,23 +93,67 @@ void __init riscv_fill_hwcap(void)
>  			continue;
>  		}
>  
> -		i = 0;
> -		isa_len = strlen(isa);
>  #if IS_ENABLED(CONFIG_32BIT)
>  		if (!strncmp(isa, "rv32", 4))
> -			i += 4;
> +			isa += 4;
>  #elif IS_ENABLED(CONFIG_64BIT)
>  		if (!strncmp(isa, "rv64", 4))
> -			i += 4;
> +			isa += 4;
>  #endif
> -		for (; i < isa_len; ++i) {
> -			this_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
> +		for (; *isa; ++isa) {
> +			const char *ext = isa++;
> +			bool ext_long, ext_err = false;

To address an issue reported in
<https://lore.kernel.org/all/202202111616.z7nZoKYj-lkp@intel.com/T/>,
it would be better to initialize ext_long to false.

> +
> +			switch (*ext) {
> +			case 's':
> +			case 'x':
> +			case 'z':
> +				/**
> +				 * 's' is a special case because:
> +				 * It can be present in base extension for supervisor
> +				 * Multi-letter extensions can start with 's' as well for
> +				 * Supervisor extensions (i.e. sstc, sscofpmf, svinval)
> +				 */
> +				if (*ext == 's' && ext[-1] != '_')
> +					break;

I think you added this as a workaround for a QEMU bug (described below),
but it causes a problem.

QEMU generates an invalid ISA string in "riscv,isa", such as
"rv64imafdcsuh".  QEMU handles 'S' as a single-letter extension, which
easily confuses my parser (it treats this as a non-existent 'Suh'
extension).

Atish tried to address this issue by handling \2 of "([^_])(S)" (that is,
any 'S' not preceded by an underscore) as a single-letter 'S' extension.

This works for QEMU-generated strings and for full ISA strings like
"rv64imafdc_svinval_svnapot".  However, it breaks on
"rv64imafdcsvinval_svnapot" (no underscore before "svinval").  This is
still a valid ISA string (with existing extensions), but it confuses
Atish's parser.
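
To make the failure concrete, here is a rough userspace sketch of the
quoted loop with the "ext[-1] != '_'" rule for 's'.  It is an illustration
under stated assumptions, not the kernel code: the function name
single_letters_prev_rule() and the string output are hypothetical, ext_long
is initialized to false, and a missing "rv32"/"rv64" prefix is rejected so
that ext[-1] is always in bounds.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define IS_LOWER(c) islower((unsigned char)(c))
#define IS_DIGIT(c) isdigit((unsigned char)(c))

/*
 * Userspace sketch (not kernel code): collect the single-letter extensions
 * the quoted loop would record when any 's' not preceded by '_' is treated
 * as single-letter.  Hypothetical helper; out needs 27 bytes.
 * Returns -1 if the string lacks an "rv32"/"rv64" prefix.
 */
static int single_letters_prev_rule(const char *isa, char out[27])
{
	unsigned long bits = 0;
	int i, n = 0;

	assert(isa != NULL);
	if (strncmp(isa, "rv32", 4) && strncmp(isa, "rv64", 4))
		return -1;
	isa += 4;	/* keeps ext[-1] below in bounds */

	for (; *isa; ++isa) {
		const char *ext = isa++;
		bool ext_long = false, ext_err = false;

		switch (*ext) {
		case 's':
		case 'x':
		case 'z':
			/* Rule under discussion: undelimited 's' is single-letter */
			if (*ext == 's' && ext[-1] != '_')
				break;
			ext_long = true;
			/* Consume the multi-letter name up to '_' or end */
			for (; *isa && *isa != '_'; ++isa)
				if (!IS_LOWER(*isa) && !IS_DIGIT(*isa))
					ext_err = true;
			break;
		default:
			if (!IS_LOWER(*ext)) {
				ext_err = true;
				break;
			}
			/* Skip an optional version such as "2" or "2p1" */
			if (!IS_DIGIT(*isa))
				break;
			while (IS_DIGIT(*++isa))
				;
			if (*isa != 'p')
				break;
			if (!IS_DIGIT(*++isa)) {
				--isa;
				break;
			}
			while (IS_DIGIT(*++isa))
				;
			break;
		}
		if (*isa != '_')
			--isa;
		if (ext_err || ext_long)
			continue;
		bits |= 1UL << (*ext - 'a');
	}

	/* Render the bitmap as a sorted letter string */
	for (i = 0; i < 26; i++)
		if (bits & (1UL << i))
			out[n++] = (char)('a' + i);
	out[n] = '\0';
	return 0;
}
```

With this sketch, "rv64imafdcsuh" yields "acdfhimsu" and
"rv64imafdc_svinval_svnapot" yields "acdfim", but
"rv64imafdcsvinval_svnapot" yields "acdfilmnsv": every letter of "svinval"
is recorded as a separate single-letter extension.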

The best solution is to fix **ONLY** QEMU and not make this exception.

If we need a QEMU workaround now, how about treating \2 of "([^_])(SU)"
as the single-letter 'S' and 'U' extensions?

Here's the background for my new workaround:

1.  Linux requires U-mode.
2.  QEMU always generates an invalid ISA string containing "SU"
    for RISC-V with MSU modes.
3.  There are no standard extensions starting with 'Su'
    (there are not even any discussions/drafts about them).
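
The narrower rule could be sketched in userspace roughly as follows.  This
is only an illustration, not the code in the new patches: the function name
parse_isa_su_rule(), its output buffers, and the recording of multi-letter
names are hypothetical, and a missing "rv32"/"rv64" prefix is rejected so
that ext[-1] is always in bounds.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IS_LOWER(c) islower((unsigned char)(c))
#define IS_DIGIT(c) isdigit((unsigned char)(c))

/*
 * Userspace sketch (not kernel code) of the narrower "SU" workaround.
 * Hypothetical helper and output format:
 *   single: sorted single-letter extensions, needs 27 bytes
 *   multi:  space-separated multi-letter names; names that do not fit
 *           in multi_sz bytes are dropped
 * Returns -1 if the string lacks an "rv32"/"rv64" prefix.
 */
static int parse_isa_su_rule(const char *isa, char single[27],
			     char *multi, size_t multi_sz)
{
	unsigned long bits = 0;
	size_t used = 0;
	int i, n = 0;

	assert(isa != NULL && multi != NULL && multi_sz > 0);
	if (strncmp(isa, "rv32", 4) && strncmp(isa, "rv64", 4))
		return -1;
	isa += 4;	/* keeps ext[-1] below in bounds */
	multi[0] = '\0';

	for (; *isa; ++isa) {
		const char *ext = isa++;
		bool ext_long = false, ext_err = false;

		switch (*ext) {
		case 's':
			/* Assumed rule: only an undelimited "su" is single-letter */
			if (ext[-1] != '_' && ext[1] == 'u')
				break;
			/* fall through */
		case 'x':
		case 'z':
			ext_long = true;
			for (; *isa && *isa != '_'; ++isa)
				if (!IS_LOWER(*isa) && !IS_DIGIT(*isa))
					ext_err = true;
			if (!ext_err) {
				size_t len = (size_t)(isa - ext);

				/* Room for separator, name, and terminator */
				if (used + len + 2 <= multi_sz) {
					if (used)
						multi[used++] = ' ';
					memcpy(multi + used, ext, len);
					used += len;
					multi[used] = '\0';
				}
			}
			break;
		default:
			if (!IS_LOWER(*ext)) {
				ext_err = true;
				break;
			}
			/* Skip an optional version such as "2" or "2p1" */
			if (!IS_DIGIT(*isa))
				break;
			while (IS_DIGIT(*++isa))
				;
			if (*isa != 'p')
				break;
			if (!IS_DIGIT(*++isa)) {
				--isa;
				break;
			}
			while (IS_DIGIT(*++isa))
				;
			break;
		}
		if (*isa != '_')
			--isa;
		if (ext_err || ext_long)
			continue;
		bits |= 1UL << (*ext - 'a');
	}

	for (i = 0; i < 26; i++)
		if (bits & (1UL << i))
			single[n++] = (char)('a' + i);
	single[n] = '\0';
	return 0;
}
```

Under these assumptions, "rv64imafdcsuh" gives single letters "acdfhimsu"
with no multi-letter names, while "rv64imafdcsvinval_svnapot" gives
"acdfim" plus "svinval svnapot".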

The following are related dumps from QEMU and Spike with my test BusyBox
image (with some notes):

QEMU + RV64GC (with hypervisor enabled) w/o Atish's workaround
=-===========================================================================-=
processor	: 0
hart		: 0
isa		: rv64imafdcsuh
isa-ext		: 
mmu		: sv48

riscv,isa (DTS) : rv64imafdcsuh   ("suh" is handled as an extension)
dmesg:
[    0.000000] riscv: ISA extensions acdfim  (hypervisor not detected!)
[    0.000000] riscv: ELF capabilities acdfim
=-===========================================================================-=

QEMU + RV64GC (with hypervisor enabled) w/ Atish's workaround
=-===========================================================================-=
processor	: 0
hart		: 0
isa		: rv64imafdcsuh
isa-ext		: 
mmu		: sv48

riscv,isa (DTS) : rv64imafdcsuh   ("suh" is handled as 'S', 'U' and 'H')
dmesg:
[    0.000000] riscv: ISA extensions acdfhimsu (extra 'S' and 'U' are harmless)
[    0.000000] riscv: ELF capabilities acdfim
=-===========================================================================-=

QEMU + RV64GC (with hypervisor enabled) w/ Tsukasa's new workaround
=-===========================================================================-=
processor	: 0
hart		: 0
isa		: rv64imafdcsuh
isa-ext		: 
mmu		: sv48

riscv,isa (DTS) : rv64imafdcsuh   ("suh" is handled as 'S', 'U' and 'H')
dmesg:
[    0.000000] riscv: ISA extensions acdfhimsu (same as Atish's workaround)
[    0.000000] riscv: ELF capabilities acdfim
=-===========================================================================-=

Spike + rv64imafdc_svinval_svnapot w/ Atish's workaround
=-===========================================================================-=
processor	: 0
hart		: 0
isa		: rv64imafdc
isa-ext		: svinval svnapot 
mmu		: sv48

riscv,isa (DTS) : rv64imafdc_svinval_svnapot
dmesg:
[    0.000000] Found ISA extension svinval
[    0.000000] Found ISA extension svnapot
[    0.000000] riscv: ISA extensions acdfim
[    0.000000] riscv: ELF capabilities acdfim
=-===========================================================================-=

Spike + rv64imafdcsvinval_svnapot w/ Atish's workaround
=-===========================================================================-=
processor	: 0
hart		: 0
isa		: rv64imafdcsvinval
isa-ext		: svnapot 
mmu		: sv48

riscv,isa (DTS) : rv64imafdcsvinval_svnapot
dmesg:
[    0.000000] Found ISA extension svnapot        (no 'Svinval' is found!)
[    0.000000] riscv: ISA extensions acdfilmnsv   ('V' et al. should not be here!)
[    0.000000] riscv: ELF capabilities acdfim
=-===========================================================================-=

Spike + rv64imafdcsvinval_svnapot w/ Tsukasa's new workaround
=-===========================================================================-=
processor	: 0
hart		: 0
isa		: rv64imafdcsvinval
isa-ext		: svinval svnapot 
mmu		: sv48

riscv,isa (DTS) : rv64imafdcsvinval_svnapot
dmesg:
[    0.000000] Found ISA extension svinval
[    0.000000] Found ISA extension svnapot
[    0.000000] riscv: ISA extensions acdfim
[    0.000000] riscv: ELF capabilities acdfim
=-===========================================================================-=


> +				ext_long = true;
> +				/* Multi-letter extension must be delimited */
> +				for (; *isa && *isa != '_'; ++isa)
> +					if (!islower(*isa) && !isdigit(*isa))
> +						ext_err = true;
> +				/* ... but must be ignored. */
> +				break;
> +			default:

> +				ext_long = false;

The line above is removed in my new patches.

> +				if (!islower(*ext)) {
> +					ext_err = true;
> +					break;
> +				}
> +				/* Find next extension */
> +				if (!isdigit(*isa))
> +					break;
> +				while (isdigit(*++isa))
> +					;
> +				if (*isa != 'p')
> +					break;
> +				if (!isdigit(*++isa)) {
> +					--isa;
> +					break;
> +				}
> +				while (isdigit(*++isa))
> +					;
> +				break;
> +			}
> +			if (*isa != '_')
> +				--isa;
>  			/*
> -			 * TODO: X, Y and Z extension parsing for Host ISA
> -			 * bitmap will be added in-future.
> +			 * TODO: Full version-aware handling including
> +			 * multi-letter extensions will be added in-future.
>  			 */
> -			if ('a' <= isa[i] && isa[i] < 'x')
> -				this_isa |= (1UL << (isa[i] - 'a'));
> +			if (ext_err || ext_long)
> +				continue;
> +			this_hwcap |= isa2hwcap[(unsigned char)(*ext)];
> +			this_isa |= (1UL << (*ext - 'a'));
>  		}
>  
>  		/*


