more /proc/cpuinfo & extension support related woes

Andrew Jones ajones at ventanamicro.com
Wed Jul 26 02:04:39 PDT 2023


On Tue, Jul 25, 2023 at 06:19:36PM +0100, Conor Dooley wrote:
> Hey,
> 
> (I've only CCed a few people here, mostly those who were involved in the
> riscv_has_extension_likely() stuff earlier this year)
> 
> I know using /proc/cpuinfo for extension detection is a bad idea, that's
> not the point of this. We can go on about how it is bad, but people are
> using it & we shouldn't knowingly lead them astray. For example, unlord
> reported on IRC yesterday that, having built a kernel using a released
> version of llvm, they were surprised to see illegal instruction
> exceptions given /proc/cpuinfo had a v in the ISA string entry.
> 
> For most extensions that we support in the kernel at the moment, the
> general flow is something along the lines of the DT etc gets parsed,
> and then we set a bit if the hart supports the extension. For some
> select extensions, that have additional properties in the DT, we perform
> a check for those before setting the bit. Extensions that the kernel
> does not understand get ignored as it does not know what to look for in
> the isa string representation.
> 
> As a result, the kernel populates a list of neither the extensions
> supported by the hart nor a list of the extensions supported by the
> running kernel. Instead, its something awkward in the middle - a list
> of extensions that a kernel built from this source code is aware of
> available on this platform.
> 
> For the case where the system, but not the kernel, supports vector, we
> do what looks like a weird dance*, where we set up vsize, but then mask
> off the hwcap bit:
> 
> 	if (elf_hwcap & COMPAT_HWCAP_ISA_V) {
> 		riscv_v_setup_vsize();
> 		/*
> 		 * ISA string in device tree might have 'v' flag, but
> 		 * CONFIG_RISCV_ISA_V is disabled in kernel.
> 		 * Clear V flag in elf_hwcap if CONFIG_RISCV_ISA_V is disabled.
> 		 */
> 		if (!IS_ENABLED(CONFIG_RISCV_ISA_V))
> 			elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
> 	}
> 	* riscv_v_setup_vsize() w/o a return check should be a NOP when
> 	the config option is disabled.
> 
> Note that only the hwcap bit is masked, but not the corresponding
> extension support tracking bit in riscv_isa. I'd swear I recall talking
> to Andy about how this could be problematic w.r.t. false reporting of
> some variety, I'll have to go find it. It is (I think) the same for any
> of the other extensions parsed from DT that has a Kconfig option that
> must be set before userspace can use it, such as F & D.
> 
> Either way, this becomes problematic for /proc/cpuinfo, where we do:
> 	static void print_isa_ext(struct seq_file *f)
> 	{
> 		struct riscv_isa_ext_data *edata;
> 		int i = 0, arr_sz;
> 	
> 		arr_sz = ARRAY_SIZE(isa_ext_arr) - 1;
> 	
> 		/* No extension support available */
> 		if (arr_sz <= 0)
> 			return;
> 	
> 		for (i = 0; i <= arr_sz; i++) {
> 			edata = &isa_ext_arr[i];
> 			if (!__riscv_isa_extension_available(NULL, edata->isa_ext_id))
> 				continue;
> 			seq_printf(f, "_%s", edata->uprop);
> 		}
> 	}
> 
> The issue here is the call to __riscv_isa_extension_available() which,
> with a NULL first argument, will use the riscv_isa bitmask that contains,
> as described above, a list of extensions that a kernel built from this
> source code is aware of that are available on this platform.
> 
> This also impacts riscv_has_extension_[un]likely(), which operates on a
> similar premise. has_vector(), has_svnapot & has_fpu(), which are the
> users of riscv_has_extension_[un]likely() don't suffer here, because when
> their config option is unset, they resolve to `return -EOPNOTSUPP;` or
> similar.
> 
> Other users of __riscv_isa_extension_available() I have not evaluated
> yet, but there are quite a few, scattered in various places, including
> kvm & hwprobe.
> 
> I think we might've just overused this particular bit of information for
> various purposes & screwed ourselves a bit. There are a few main places
> this information is used:
> 
> 1 Figuring what exactly the hardware supports during boot. Evan has
>   added per-CPU tracking of this, although the kernel is limited here to
>   the subset that it knows AND further restricted by a call to
>   riscv_isa_extension_check(). This we don't actually use yet after
>   boot, but Evan has a user on the lists. This per-CPU tracking is then
>   used to create and LCD set of extensions that drives
>   __riscv_isa_extension_available().
> 
> 2 Telling userspace what extensions are supported. However, by what? The
>   running-kernel & hw combo? The hardware? I tried to read the hwprobe
>   doc and couldn't tell. /proc/cpuinfo, as I already said, is a
>   mixed-mess, but hwprobe should mean "this extension is supported by
>   the hardware (or firmware) and you can use this without a concern",
>   right? Evan's proposed per-CPU reporting here adds a third thing to
>   the mix, but is intended to (AFAIU) represent the set of
>   known-to-the-kernel extensions that each hart supports, regardless of
>   whether the kernel itself supports them.
> 
> 3 Setting dynamic code paths in the kernel. For example, if the hardware
>   supports Zbb, patch in the string manip routines. This is distinct
>   from telling userspace it can use the extension, since something like
>   Zbb can be used by userspace even if the kernel doesn't touch it.
> 
> 4 Telling in-kernel users what extensions they can use. This, I figure,
>   is more along the lines of my theorised meaning for hwprobe.
> 
> I think for case 3, we are mostly okay. Either this sort of code uses
> alternatives, that have an in-built config dependency, or the
> has_whatever() stuff that has an alternative implementation if the
> correct config option is not set.
> 
> For case 2, I think hwprobe is doing the right thing (for now) but
> /proc/cpuinfo needs to be changed.
> 
> Case 4 probably needs an audit to make sure that nothing is going
> awry. I'll go have a look through the users.
> 
> If other people think splitting this up a worthwhile endeavour, I'll
> go write some patches. I do FWIW. Or maybe I am overlooking something
> that makes this less of a problem than I think. LMK if so :)
> 
> Thanks,
> Conor.

Hi Conor,

Here's a summary of my thoughts, which probably aren't adding anything new
to this

 - The kernel needs per-hart bitmaps expressing what extensions are
   supported for each given hart.

 - The kernel needs a bitmap expressing the LCD of extensions across all
   harts.

 (The two bitmaps are clearly expressing the intersection of what's
  supported by a hart and what the kernel is aware of, since the kernel
  must allocate the bits.)

 - Extensions in DT/ACPI that the kernel is unaware of get ignored by
   the kernel and not exposed to userspace (if the extensions are for
   U-mode and U-mode has another way of determining their presence to
   use them, then good for U-mode)

 - The kernel will combine information from one of the two bitmaps listed
   above with other information, such as config information, to decide
   when / if an extension should be used by the kernel and/or exposed to
   userspace (it'd be good to have a consistent API for these types of
   checks which combine extension presence with other information)

 - The kernel will expose sanitized extension information to userspace.
   S-mode only extensions do not need to be exposed and extensions that
   require kernel support should only be exposed when the kernel is
   willing and able to support them. Additional properties of extensions,
   such as block sizes, also need to be exposed (exposure is done through
   hwprobe)

I think we have most/all the above already, but the audits you've
volunteered to do would be welcome, along with maybe documenting
it all somewhere. Then, after that, all that's left is the "what to
do with /proc/cpuinfo and hwcap?" question. Can we simply freeze
what extensions are exposed through them and mark them as deprecated?

Thanks,
drew



More information about the linux-riscv mailing list