[PATCH] arm64: guard AMU register access with ARM64_HAS_AMU_EXTN

Fri Oct 24 02:27:50 PDT 2025

On Thu, 23 Oct 2025 16:58:49 +0100,
Marek Vasut <marek.vasut at mailbox.org> wrote:
> 
> On 10/23/25 4:19 PM, Marc Zyngier wrote:
> 
> Hello Marc,
> 
> >> Except right now, I still trigger the AMU faults even with
> >> ARM64_HAS_AMU_EXTN=n , which I think should not happen ?
> > 
> > ARM64_HAS_AMU_EXTN is a *capability*, not a configuration.
> > CONFIG_ARM64_AMU_EXTN is the configuration. I have the feeling you're
> > mixing the two.
> > 
> > Irrespective of the configuration, we access the AMU registers
> > depending on what is advertised, because we *must* make these
> > registers inaccessible from EL0, no matter what.
> 
> Ahhh, I was missing this part, thank you for clarifying.
> 
> >> I would much rather be able to disable ARM64_HAS_AMU_EXTN in kernel
> >> config for the old devices with old firmware, without triggering the
> >> faults ... and say that everything which is going to be upstream will
> >> always use new firmware that has proper working AMU support.
> > 
> > No, that's the wrong approach. If you leave the AMU accessible to EL0,
> > you're leaking data to userspace, and that's pretty wrong, no matter
> > how you look at it.
> > 
> > I also think your hack works by pure luck, because at the point where
> > your CPUs are booting, the alternatives are not yet in place (the
> > kernel patching happens much later). In short, this breaks
> > *everything*.
> > 
> > As I indicated before, you have two options:
> > 
> > - either you update your firmware and leave the kernel alone
> > 
> > - or you implement the workaround as ID register override so that you
> >    *must* pass something on the kernel command line to boot, and by
> >    then accept that you will leak critical timing information to
> >    userspace.
> > 
> > Any other option, including guarding the macro with a config option, is
> > *not* acceptable.
> 
> Since I am getting an exception when I access the AMU register, would
> it be possible to trap that exception, and report something to the
> user instead of outright crashing with no output ?

The trap exists, and the exception is being routed to EL3. There is
nothing you can do about that if running at EL2, and if at EL1, you'd
need to take the trap to EL2 to handle it. And if you can do that,
what do you do?  Not doing anything is wrong, and doing something will
nuke your machine.
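
To make the routing concrete, here is a deliberately simplified model of
where an AMU register access ends up. It is a sketch and not the
architecture pseudocode: it ignores VHE host mode (HCR_EL2.TGE), the
Secure debug corner cases, and the UNDEFINED outcomes. The struct and
function names are made up for illustration; only the control bits
(AMUSERENR_EL0.EN, CPTR_EL2.TAM, CPTR_EL3.TAM) are architectural.

```c
#include <stdbool.h>

/* Exception levels, lowest privilege first. */
enum el { EL0, EL1, EL2, EL3 };

/* Possible results of touching an AMU register in this model. */
enum outcome { ACCESS_OK, TRAP_EL1, TRAP_EL2, TRAP_EL3 };

/* Hypothetical snapshot of the trap controls relevant to the AMU. */
struct amu_traps {
	bool amuserenr_en;	/* AMUSERENR_EL0.EN: EL0 access allowed */
	bool el2_enabled;	/* EL2 implemented and in use */
	bool cptr_el2_tam;	/* CPTR_EL2.TAM: trap AMU accesses to EL2 */
	bool cptr_el3_tam;	/* CPTR_EL3.TAM: trap AMU accesses to EL3 */
};

/*
 * Simplified routing: the lowest applicable trap wins. An access from
 * EL2 can only be caught by EL3, which is why nothing running at EL2
 * can intercept it.
 */
static enum outcome amu_access(enum el from, const struct amu_traps *t)
{
	if (from == EL0 && !t->amuserenr_en)
		return TRAP_EL1;
	if (from <= EL1 && t->el2_enabled && t->cptr_el2_tam)
		return TRAP_EL2;
	if (from <= EL2 && t->cptr_el3_tam)
		return TRAP_EL3;
	return ACCESS_OK;
}
```

With CPTR_EL3.TAM set by firmware, an access from EL2 goes to EL3
whatever the kernel does. In the same model, an EL0 access with
AMUSERENR_EL0.EN left set and no EL2 trap also lands in EL3.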

> Similar to what Linux already does on the various speculative
> execution bugs on x86, something like this?
> 
> "
> MDS CPU bug present and SMT on, data leak possible. See
> https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html
> for more details.
> "

You're completely off base. The problem at hand has nothing to do with
speculation, and everything to do with access permission to counter
registers.

I also wouldn't be surprised if you could take your whole machine down
from userspace just by tickling some of the AM*_EL0 registers (the
pseudocode clearly shows that there is a route to EL3 in this case).

Honestly, I think you should stop trying to paper over this issue
behind the user's back. If you want this addressed, do it so that the
user knows their machine is fsck'd, and that they are OK with that. Do
it by implementing an ID register override that requires a kernel
command-line argument.
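
For reference, the mechanics of such an override boil down to a
(value, mask) pair applied on top of what the hardware reports. The
following is a standalone sketch of that idea, not kernel code: the
struct and helper names are hypothetical, and the command-line spelling
is deliberately left out. The only architectural fact assumed is that
the AMU field of ID_AA64PFR0_EL1 lives in bits [47:44].

```c
#include <assert.h>
#include <stdint.h>

/* ID_AA64PFR0_EL1.AMU occupies bits [47:44]; ID fields are 4 bits wide. */
#define PFR0_AMU_SHIFT	44
#define ID_FIELD_WIDTH	4

/* Hypothetical stand-in for a per-register override: value plus mask. */
struct idreg_override {
	uint64_t val;
	uint64_t mask;
};

/* Record that the 4-bit field at 'shift' must read as 'v'. */
static void override_field(struct idreg_override *o, unsigned int shift,
			   uint64_t v)
{
	uint64_t field_mask;

	assert(shift + ID_FIELD_WIDTH <= 64);
	field_mask = ((UINT64_C(1) << ID_FIELD_WIDTH) - 1) << shift;

	o->mask |= field_mask;
	o->val = (o->val & ~field_mask) | ((v << shift) & field_mask);
}

/* Combine the hardware value with the override; unmasked bits pass through. */
static uint64_t apply_override(uint64_t hw, const struct idreg_override *o)
{
	return (hw & ~o->mask) | (o->val & o->mask);
}

/* Extract the AMU field from an ID_AA64PFR0_EL1 value. */
static unsigned int amu_field(uint64_t pfr0)
{
	return (unsigned int)((pfr0 >> PFR0_AMU_SHIFT) & 0xf);
}
```

Forcing the AMU field to 0 makes the feature look absent to everything
that consults the sanitised register, while the rest of the register is
left untouched. The override only exists if the user asked for it on
the command line, which is the whole point.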

Do I sound like a stuck record? Probably. But that's IMO the only
acceptable solution for what you have. I'm looking forward to
reviewing a patch implementing that suggestion, but I'll stop even
thinking of how to paper over this in the way you suggest.

Thanks,

	M.

-- 
Jazz isn't dead. It just smells funny.