more /proc/cpuinfo & extension support related woes

Conor Dooley conor at kernel.org
Tue Jul 25 10:19:36 PDT 2023


Hey,

(I've only CCed a few people here, mostly those who were involved in the
riscv_has_extension_likely() stuff earlier this year)

I know using /proc/cpuinfo for extension detection is a bad idea, that's
not the point of this. We can go on about how it is bad, but people are
using it & we shouldn't knowingly lead them astray. For example, unlord
reported on IRC yesterday that, having built a kernel using a released
version of llvm, they were surprised to see illegal instruction
exceptions given /proc/cpuinfo had a v in the ISA string entry.

For most extensions that we support in the kernel at the moment, the
general flow is something along the lines of the DT etc gets parsed,
and then we set a bit if the hart supports the extension. For some
select extensions, that have additional properties in the DT, we perform
a check for those before setting the bit. Extensions that the kernel
does not understand get ignored as it does not know what to look for in
the isa string representation.

As a result, the kernel populates a list of neither the extensions
supported by the hart nor a list of the extensions supported by the
running kernel. Instead, its something awkward in the middle - a list
of extensions that a kernel built from this source code is aware of
available on this platform.

For the case where the system, but not the kernel, supports vector, we
do what looks like a weird dance*, where we set up vsize, but then mask
off the hwcap bit:

	if (elf_hwcap & COMPAT_HWCAP_ISA_V) {
		riscv_v_setup_vsize();
		/*
		 * ISA string in device tree might have 'v' flag, but
		 * CONFIG_RISCV_ISA_V is disabled in kernel.
		 * Clear V flag in elf_hwcap if CONFIG_RISCV_ISA_V is disabled.
		 */
		if (!IS_ENABLED(CONFIG_RISCV_ISA_V))
			elf_hwcap &= ~COMPAT_HWCAP_ISA_V;
	}
	* riscv_v_setup_vsize() w/o a return check should be a NOP when
	the config option is disabled.

Note that only the hwcap bit is masked, but not the corresponding
extension support tracking bit in riscv_isa. I'd swear I recall talking
to Andy about how this could be problematic w.r.t. false reporting of
some variety, I'll have to go find it. It is (I think) the same for any
of the other extensions parsed from DT that has a Kconfig option that
must be set before userspace can use it, such as F & D.

Either way, this becomes problematic for /proc/cpuinfo, where we do:
	static void print_isa_ext(struct seq_file *f)
	{
		struct riscv_isa_ext_data *edata;
		int i = 0, arr_sz;
	
		arr_sz = ARRAY_SIZE(isa_ext_arr) - 1;
	
		/* No extension support available */
		if (arr_sz <= 0)
			return;
	
		for (i = 0; i <= arr_sz; i++) {
			edata = &isa_ext_arr[i];
			if (!__riscv_isa_extension_available(NULL, edata->isa_ext_id))
				continue;
			seq_printf(f, "_%s", edata->uprop);
		}
	}

The issue here is the call to __riscv_isa_extension_available() which,
with a NULL first argument, will use the riscv_isa bitmask that contains,
as described above, a list of extensions that a kernel built from this
source code is aware of that are available on this platform.

This also impacts riscv_has_extension_[un]likely(), which operates on a
similar premise. has_vector(), has_svnapot & has_fpu(), which are the
users of riscv_has_extension_[un]likely() don't suffer here, because when
their config option is unset, they resolve to `return -EOPNOTSUPP;` or
similar.

Other users of __riscv_isa_extension_available() I have not evaluated
yet, but there are quite a few, scattered in various places, including
kvm & hwprobe.

I think we might've just overused this particular bit of information for
various purposes & screwed ourselves a bit. There are a few main places
this information is used:

1 Figuring what exactly the hardware supports during boot. Evan has
  added per-CPU tracking of this, although the kernel is limited here to
  the subset that it knows AND further restricted by a call to
  riscv_isa_extension_check(). This we don't actually use yet after
  boot, but Evan has a user on the lists. This per-CPU tracking is then
  used to create and LCD set of extensions that drives
  __riscv_isa_extension_available().

2 Telling userspace what extensions are supported. However, by what? The
  running-kernel & hw combo? The hardware? I tried to read the hwprobe
  doc and couldn't tell. /proc/cpuinfo, as I already said, is a
  mixed-mess, but hwprobe should mean "this extension is supported by
  the hardware (or firmware) and you can use this without a concern",
  right? Evan's proposed per-CPU reporting here adds a third thing to
  the mix, but is intended to (AFAIU) represent the set of
  known-to-the-kernel extensions that each hart supports, regardless of
  whether the kernel itself supports them.

3 Setting dynamic code paths in the kernel. For example, if the hardware
  supports Zbb, patch in the string manip routines. This is distinct
  from telling userspace it can use the extension, since something like
  Zbb can be used by userspace even if the kernel doesn't touch it.

4 Telling in-kernel users what extensions they can use. This, I figure,
  is more along the lines of my theorised meaning for hwprobe.

I think for case 3, we are mostly okay. Either this sort of code uses
alternatives, that have an in-built config dependency, or the
has_whatever() stuff that has an alternative implementation if the
correct config option is not set.

For case 2, I think hwprobe is doing the right thing (for now) but
/proc/cpuinfo needs to be changed.

Case 4 probably needs an audit to make sure that nothing is going
awry. I'll go have a look through the users.

If other people think splitting this up a worthwhile endeavour, I'll
go write some patches. I do FWIW. Or maybe I am overlooking something
that makes this less of a problem than I think. LMK if so :)

Thanks,
Conor.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-riscv/attachments/20230725/262d78d7/attachment.sig>


More information about the linux-riscv mailing list