more /proc/cpuinfo & extension support related woes

Conor Dooley conor.dooley at microchip.com
Wed Jul 26 03:41:05 PDT 2023


On Wed, Jul 26, 2023 at 11:04:39AM +0200, Andrew Jones wrote:
> On Tue, Jul 25, 2023 at 06:19:36PM +0100, Conor Dooley wrote:

> > (I've only CCed a few people here, mostly those who were involved in the
> > riscv_has_extension_likely() stuff earlier this year)
> > 
> > I know using /proc/cpuinfo for extension detection is a bad idea, that's
> > not the point of this. We can go on about how it is bad, but people are
> > using it & we shouldn't knowingly lead them astray. For example, unlord
> > reported on IRC yesterday that, having built a kernel using a released
> > version of llvm, they were surprised to see illegal instruction
> > exceptions given /proc/cpuinfo had a v in the ISA string entry.
> > 
> > For most extensions that we support in the kernel at the moment, the
> > general flow is something along the lines of the DT etc gets parsed,
> > and then we set a bit if the hart supports the extension. For some
> > select extensions, that have additional properties in the DT, we perform
> > a check for those before setting the bit. Extensions that the kernel
> > does not understand get ignored as it does not know what to look for in
> > the isa string representation.
> > 
> > As a result, the kernel populates a list of neither the extensions
> > supported by the hart nor a list of the extensions supported by the
> > running kernel. Instead, its something awkward in the middle - a list
> > of extensions that a kernel built from this source code is aware of
> > available on this platform.
> > 
> > For the case where the system, but not the kernel, supports vector, we
> > do what looks like a weird dance*, where we set up vsize, but then mask
> > off the hwcap bit:
> > 
> > 	if (elf_hwcap & COMPAT_HWCAP_ISA_V) {
> > 		riscv_v_setup_vsize();
> > 		/*
> > 		 * ISA string in device tree might have 'v' flag, but
> > 		 * CONFIG_RISCV_ISA_V is disabled in kernel.
> > 		 * Clear V flag in elf_hwcap if CONFIG_RISCV_ISA_V is disabled.
> > 		 */
> > 		if (!IS_ENABLED(CONFIG_RISCV_ISA_V))
> > 			elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> > 	}
> > 	* riscv_v_setup_vsize() w/o a return check should be a NOP when
> > 	the config option is disabled.
> > 
> > Note that only the hwcap bit is masked, but not the corresponding
> > extension support tracking bit in riscv_isa. I'd swear I recall talking
> > to Andy about how this could be problematic w.r.t. false reporting of
> > some variety, I'll have to go find it. It is (I think) the same for any
> > of the other extensions parsed from DT that has a Kconfig option that
> > must be set before userspace can use it, such as F & D.
> > 
> > Either way, this becomes problematic for /proc/cpuinfo, where we do:
> > 	static void print_isa_ext(struct seq_file *f)
> > 	{
> > 		struct riscv_isa_ext_data *edata;
> > 		int i = 0, arr_sz;
> > 	
> > 		arr_sz = ARRAY_SIZE(isa_ext_arr) - 1;
> > 	
> > 		/* No extension support available */
> > 		if (arr_sz <= 0)
> > 			return;
> > 	
> > 		for (i = 0; i <= arr_sz; i++) {
> > 			edata = &isa_ext_arr[i];
> > 			if (!__riscv_isa_extension_available(NULL, edata->isa_ext_id))
> > 				continue;
> > 			seq_printf(f, "_%s", edata->uprop);
> > 		}
> > 	}
> > 
> > The issue here is the call to __riscv_isa_extension_available() which,
> > with a NULL first argument, will use the riscv_isa bitmask that contains,
> > as described above, a list of extensions that a kernel built from this
> > source code is aware of that are available on this platform.
> > 
> > This also impacts riscv_has_extension_[un]likely(), which operates on a
> > similar premise. has_vector(), has_svnapot & has_fpu(), which are the
> > users of riscv_has_extension_[un]likely() don't suffer here, because when
> > their config option is unset, they resolve to `return -EOPNOTSUPP;` or
> > similar.
> > 
> > Other users of __riscv_isa_extension_available() I have not evaluated
> > yet, but there are quite a few, scattered in various places, including
> > kvm & hwprobe.
> > 
> > I think we might've just overused this particular bit of information for
> > various purposes & screwed ourselves a bit. There are a few main places
> > this information is used:
> > 
> > 1 Figuring what exactly the hardware supports during boot. Evan has
> >   added per-CPU tracking of this, although the kernel is limited here to
> >   the subset that it knows AND further restricted by a call to
> >   riscv_isa_extension_check(). This we don't actually use yet after
> >   boot, but Evan has a user on the lists. This per-CPU tracking is then
> >   used to create and LCD set of extensions that drives
> >   __riscv_isa_extension_available().
> > 
> > 2 Telling userspace what extensions are supported. However, by what? The
> >   running-kernel & hw combo? The hardware? I tried to read the hwprobe
> >   doc and couldn't tell. /proc/cpuinfo, as I already said, is a
> >   mixed-mess, but hwprobe should mean "this extension is supported by
> >   the hardware (or firmware) and you can use this without a concern",
> >   right? Evan's proposed per-CPU reporting here adds a third thing to
> >   the mix, but is intended to (AFAIU) represent the set of
> >   known-to-the-kernel extensions that each hart supports, regardless of
> >   whether the kernel itself supports them.
> > 
> > 3 Setting dynamic code paths in the kernel. For example, if the hardware
> >   supports Zbb, patch in the string manip routines. This is distinct
> >   from telling userspace it can use the extension, since something like
> >   Zbb can be used by userspace even if the kernel doesn't touch it.
> > 
> > 4 Telling in-kernel users what extensions they can use. This, I figure,
> >   is more along the lines of my theorised meaning for hwprobe.
> > 
> > I think for case 3, we are mostly okay. Either this sort of code uses
> > alternatives, that have an in-built config dependency, or the
> > has_whatever() stuff that has an alternative implementation if the
> > correct config option is not set.
> > 
> > For case 2, I think hwprobe is doing the right thing (for now) but
> > /proc/cpuinfo needs to be changed.
> > 
> > Case 4 probably needs an audit to make sure that nothing is going
> > awry. I'll go have a look through the users.
> > 
> > If other people think splitting this up a worthwhile endeavour, I'll
> > go write some patches. I do FWIW. Or maybe I am overlooking something
> > that makes this less of a problem than I think. LMK if so :)

> Here's a summary of my thoughts, which probably aren't adding anything new
> to this
> 
>  - The kernel needs per-hart bitmaps expressing what extensions are
>    supported for each given hart.

We sorta have this. The per-hart bitmap is a bitmap of extensions that
have defines in hwcap.h (an understandable limitation) and for which
riscv_isa_extension_check() passes.

>  - The kernel needs a bitmap expressing the LCD of extensions across all
>    harts.

Again, we sorta have this. It's all of the per-hart bitmaps, bitwise
ANDed.

>  (The two bitmaps are clearly expressing the intersection of what's
>   supported by a hart and what the kernel is aware of, since the kernel
>   must allocate the bits.)

Right. With the exception of the riscv_isa_extension_check(), that we
currently apply to Zicbo{m,z}.

Getting ahead of myself a little, but I wonder if we should remove the
riscv_isa_extension_check() stuff from from the code that sets the
bitmaps & instead have it as a separate stage, which would align our
bitmaps with what you suggest here.

>  - Extensions in DT/ACPI that the kernel is unaware of get ignored by
>    the kernel and not exposed to userspace (if the extensions are for
>    U-mode and U-mode has another way of determining their presence to
>    use them, then good for U-mode)

Sure.

>  - The kernel will combine information from one of the two bitmaps listed
>    above with other information, such as config information, to decide
>    when / if an extension should be used by the kernel and/or exposed to
>    userspace 

Right. This bit (or rather two bits, since I view the kernel and
userspace bits here separately) is where we are falling short. I think
hwprobe's limited users do the right thing, but we're not doing this for
/proc/cpuinfo. For the in-kernel users, grepping shows some suspicious
looking things, but how many are problematic I do not yet know. As an
example, does
static void kvm_riscv_vcpu_update_config(const unsigned long *isa)
{
	u64 henvcfg = 0;

	if (riscv_isa_extension_available(isa, SVPBMT))
		henvcfg |= ENVCFG_PBMTE;

work correctly if the SVPBMT Kconfig option is disabled?
From a quick check, `isa` here is set from the host ISA
	/* Setup ISA features available to VCPU */
	for (i = 0; i < ARRAY_SIZE(kvm_isa_ext_arr); i++) {
		host_isa = kvm_isa_ext_arr[i];
		if (__riscv_isa_extension_available(NULL, host_isa) &&
		    kvm_riscv_vcpu_isa_enable_allowed(i))
			set_bit(host_isa, vcpu->arch.isa);
	}
so if the check passes for the host ISA, it'll pass for the guest ISA
too, so henvcfg will end up with the PBMTE bit set. I just haven't yet
checked what the outcome of this will be, but I figure not good?

> (it'd be good to have a consistent API for these types of
>    checks which combine extension presence with other information)

I figure the best option might just be to make the
__riscv_isa_extension_available() into this to avoid disruption? The
vast majority of users of that function want to know whether or not it
is safe to use the extension on that hart, not whether the hart itself
supports it. In fact, outside of Evan's per-hart stuff in /proc/cpuinfo,
do we have any users that would even want an API that checks the
"unfiltered" versions? I'll have to check the users to answer that I
think.

>  - The kernel will expose sanitized extension information to userspace.
>    S-mode only extensions do not need to be exposed and extensions that
>    require kernel support should only be exposed when the kernel is
>    willing and able to support them. Additional properties of extensions,
>    such as block sizes, also need to be exposed (exposure is done through
>    hwprobe)

It gets extra messy too when the prctl etc get involved. What do you
report in /proc/cpuinfo's isa field when RISCV_ISA_V_DEFAULT_ENABLE is
set to disabled? It would have to respect the sysctl, right?
Ditto the prctl, which will interact "nicely" with things like Google's
cpu_feature library that parses /proc/cpuinfo.
	https://github.com/google/cpu_features

> I think we have most/all the above already, but the audits you've
> volunteered to do would be welcome, along with maybe documenting
> it all somewhere. Then, after that, all that's left is the "what to
> do with /proc/cpuinfo and hwcap?" question.

> Can we simply freeze
> what extensions are exposed through them and mark them as deprecated?

hwcap is already limited to the single-letter stuff, and is already
disabling things like F, D & V if the config options are not set.

But for /proc/cpuinfo, that is sorta where I was wondering if we should
go, but hwprobe isn't a replacement for "casual" checking of what
is/isn't supported, so it may be premature. Certainly, leaving people
without an easy way to check by eye what extensions are present seems
overly painful.

Cheers,
Conor.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20230726/4fa21d59/attachment.sig>


More information about the linux-riscv mailing list