[RFC PATCH v2 4/4] arm64: Export id_aar64fpr0 via sysfs

Wed Oct 21 13:19:48 EDT 2020

On Wed, Oct 21, 2020 at 05:18:37PM +0100, Catalin Marinas wrote:
> On Wed, Oct 21, 2020 at 04:37:38PM +0100, Will Deacon wrote:
> > On Wed, Oct 21, 2020 at 04:10:06PM +0100, Catalin Marinas wrote:
> > > On Wed, Oct 21, 2020 at 03:45:43PM +0100, Will Deacon wrote:
> > > > On Wed, Oct 21, 2020 at 03:09:46PM +0100, Catalin Marinas wrote:
> > > > > Anyway, if the task placement is entirely off the table, the next thing
> > > > > is asking applications to set their own mask and kill them if they do
> > > > > the wrong thing. Here I see two possibilities for killing an app:
> > > > > 
> > > > > 1. When it ends up scheduled on a non-AArch32-capable CPU
> > > > 
> > > > That sounds fine to me. If we could do the exception return and take a
> > > > SIGILL, that's what we'd do, but we can't so we have to catch it before.
> > > 
> > > Indeed, the illegal ERET doesn't work for this scenario.
> > > 
> > > > > 2. If the user cpumask (bar the offline CPUs) is not a subset of the
> > > > >    aarch32_mask
> > > > > 
> > > > > Option 1 is simpler but 2 would be slightly more consistent.
> > > > 
> > > > I disagree -- if we did this for something like fpsimd, then the consistent
> > > > behaviour would be to SIGILL on the cores without the instructions.
> > > 
> > > For fpsimd it makes sense since the main ISA is still available and the
> > > application may be able to do something with the signal. But here we
> > > can't do much since the entire AArch32 mode is not supported. That's why
> > > we went for SIGKILL instead of SIGILL but thinking of it, after execve()
> > > the signals are reset to SIG_DFL so SIGILL cannot be ignored.
> > > 
> > > I think it depends on whether you look at this fault as a part of ISA
> > > not being available or as the overall application not compatible with
> > > the system it is running on. If the latter, option 2 above makes more
> > > sense.
> > 
> > Hmm, I'm not sure I see the distinction in practice: you still have a binary
> > application that cannot run on all CPUs in the system. Who cares if some of
> > the instructions work?
> 
> The failure would be more predictable rather than the app running for a
> while and randomly getting SIGKILL. If it only fails on execve or
> sched_setaffinity, it may be easier to track down (well, there's the CPU
> hotplug as well that can change the cpumask intersection outside the
> user process control).

But it's half-baked, because the moment the 32-bit task changes its affinity
mask then you're back in the old situation. That's why I'm saying this
doesn't add anything, because the rest of the series is designed entirely
around delivering SIGKILL at the last minute rather than preventing us
getting to that situation in the first place. The execve() case feels to me
like we're considering doing something because we can, rather than because
it's actually useful.
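
To make the two checks concrete, here is a minimal sketch of option 1 and
option 2 from the quoted text. It is a toy model, not code from this series:
the masks are plain 64-bit bitmaps rather than the kernel's struct cpumask,
and every name in it (toy_cpumask_t, toy_mask_ok_for_aarch32(),
toy_cpu_ok_for_aarch32(), toy_demo()) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per CPU, at most 64 CPUs. Not the kernel's cpumask. */
typedef uint64_t toy_cpumask_t;

/*
 * Option 2: the user mask, ignoring offline CPUs, must be a subset of
 * the mask of 32-bit-capable CPUs.
 */
static bool toy_mask_ok_for_aarch32(toy_cpumask_t user, toy_cpumask_t online,
				    toy_cpumask_t aarch32)
{
	return ((user & online) & ~aarch32) == 0;
}

/* Option 1: only the CPU the task is about to run on matters. */
static bool toy_cpu_ok_for_aarch32(unsigned int cpu, toy_cpumask_t aarch32)
{
	return cpu < 64 && ((aarch32 >> cpu) & 1);
}

/* Walks through the "check at execve(), then widen the affinity" case. */
static void toy_demo(void)
{
	toy_cpumask_t aarch32 = 0x0f;	/* assumed: CPUs 0-3 support 32-bit */
	toy_cpumask_t online  = 0xff;	/* assumed: CPUs 0-7 are online */
	toy_cpumask_t user    = 0x03;	/* task pinned to CPUs 0-1 */

	/* The mask check passes at execve() time ... */
	assert(toy_mask_ok_for_aarch32(user, online, aarch32));

	/* ... but stops holding once the task widens its own affinity. */
	user |= 0x30;			/* add CPUs 4-5 */
	assert(!toy_mask_ok_for_aarch32(user, online, aarch32));

	/* The per-CPU check is still needed at that point. */
	assert(toy_cpu_ok_for_aarch32(1, aarch32));
	assert(!toy_cpu_ok_for_aarch32(5, aarch32));
}
```

In this model a check done only at execve() says nothing about the mask the
task has later, so the per-CPU check has to stay either way.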

> > > > > There's also the question on whether the kernel should allow an ELF32 to
> > > > > be loaded (and potentially killed subsequently) if the user mask is not
> > > > > correct on execve().
> > > > 
> > > > I don't see the point in distinguishing between "you did execve() on a core
> > > > without 32-bit" and "you did execve() on a core with 32-bit and then
> > > > migrated to a core without 32-bit".
> > > 
> > > In the context of option 2 above, it's more about whether execve()
> > > returns -ENOEXEC or the process gets a SIGKILL immediately.
> > 
> > I just don't see what we gain by returning -ENOEXEC except for extra code
> > and behaviour in the ABI (and if you wanted consistency you'd also need
> > to fail attempts to widen the affinity mask to include 64-bit-only cores
> > from a 32-bit task).
> 
> The -ENOEXEC is more in line with the current behaviour not allowing
> ELF32 on systems that are not fully symmetric. So basically you'd have
> a global opt-in as sysctl and a per-application opt-in via the affinity
> mask.

I think it's a bit strong calling that an opt-in, as the application could
just happen to be using the right cpumask.

> I do agree that it complicates the kernel implementation.
> 
> > In other words, I don't think the kernel needs to hold userspace's hand
> > for an opt-in feature that requires userspace to handle scheduling for
> > optimal power/performance _anyway_. Allowing the affinity to be set
> > arbitrarily and then killing the task if it ends up trying to run on the
> > wrong CPU is both simple and sufficient.
> 
> Fine by me if you want to keep things simple, less code to maintain.
> 
> However, it would be good to know if the Android kernel/user guys are
> happy with this approach. If the Android kernel ends up carrying
> additional patches for task placement, I'd question why we need to merge
> this (partial) series at all.

Hmm, those folks aren't even on CC :(

Adding Suren and Marco...

Will