[PATCH 2/3] arm64: mm: Handle LVA support as a CPU feature

Mark Rutland mark.rutland at arm.com
Thu Dec 1 03:48:23 PST 2022


On Thu, Dec 01, 2022 at 12:22:43PM +0100, Ard Biesheuvel wrote:
> On Thu, 1 Dec 2022 at 12:14, Mark Rutland <mark.rutland at arm.com> wrote:
> >
> > On Wed, Nov 30, 2022 at 04:40:11PM +0000, Catalin Marinas wrote:
> > > On Wed, Nov 30, 2022 at 05:29:55PM +0100, Ard Biesheuvel wrote:
> > > > On Wed, 30 Nov 2022 at 17:28, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > > > On Wed, Nov 30, 2022 at 03:56:26PM +0100, Ard Biesheuvel wrote:
> > > > > > On Wed, 30 Nov 2022 at 15:50, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > > > > > > On Tue, Nov 15, 2022 at 03:38:23PM +0100, Ard Biesheuvel wrote:
> > > > > > > > Currently, we detect CPU support for 52-bit virtual addressing (LVA)
> > > > > > > > extremely early, before creating the kernel page tables or enabling the
> > > > > > > > MMU. We cannot override the feature this early, and so large virtual
> > > > > > > > addressing is always enabled on CPUs that implement it, provided the
> > > > > > > > software support for it was enabled at build time. It also means we
> > > > > > > > rely on non-trivial asm code to deal with this feature.
> > > > > > > >
> > > > > > > > Given that both the ID map and the TTBR1 mapping of the kernel image are
> > > > > > > > guaranteed to be 48-bit addressable, it is not actually necessary to
> > > > > > > > enable support this early, and instead, we can model it as a CPU
> > > > > > > > feature. That way, we can rely on code patching to get the correct
> > > > > > > > TCR.T1SZ values programmed on secondary boot and resume from suspend.
> > > > > > > >
> > > > > > > > On the primary boot path, we simply enable the MMU with 48-bit virtual
> > > > > > > > addressing initially, and update TCR.T1SZ from C code if LVA is
> > > > > > > > supported, right before creating the kernel mapping. Given that TTBR1
> > > > > > > > still points to reserved_pg_dir at this point, updating TCR.T1SZ should
> > > > > > > > be safe without the need for explicit TLB maintenance.
> > > > > > >
> > > > > > > I'm not sure we can skip the TLBI here. There's some weird rule in the
> > > > > > > ARM ARM that if you change any of the fields that may be cached in the TLB,
> > > > > > > maintenance is needed even if the MMU is off. From the latest version
> > > > > > > (I.a, I didn't dig into H.a),
> > > > > > >
> > > > > > > R_VNRFW:
> > > > > > >   When a System register field is modified and that field is permitted
> > > > > > >   to be cached in a TLB, software is required to invalidate all TLB
> > > > > > >   entries that might be affected by the field, at any address
> > > > > > >   translation stage in the translation regime even if the translation
> > > > > > >   stage is disabled, using the appropriate VMID and ASID, after any
> > > > > > >   required System register synchronization.
> > > > > >
> > > > > > Don't we already rely on this in cpu_set_default_tcr_t0sz() /
> > > > > > cpu_set_idmap_tcr_t0sz()?
> > > > >
> > > > > Yeah, we do this and depending on how you read the above rule, we may
> > > > > need to move the local_flush_tlb_all() line to after the T0SZ setting.
> > > >
> > > > OK, so wouldn't this mean that we cannot change TxSZ at all without
> > > > going through a MMU off/on cycle?
> > >
> > > I don't think so. The way I see it is that the change is not guaranteed
> > > to take effect until we invalidate the TLBs. We don't risk fetching
> > > random stuff in the TLB since we have the reserved TTBR0 at that point.
> > > If the subsequent cpu_switch_mm() changed the ASID, under some
> > > interpretations of the ARM ARM we could have skipped the TLBI, but
> > > that's not the case anyway.
> > >
> > > Looking for Mark R's opinion as I recall he talked to the architects in
> > > the past about this (though I think it was in the context of SCTLR_EL1).
> >
> > The architecture is unfortunately vague here.
> >
> > From prior discussions with architects, the general rule was "if it's permitted
> > to be cached in a TLB, an update must be followed by a TLBI for that update to
> > take effect, regardless of SCTLR_ELx.M". The architecture isn't very specific
> > about what scope of maintenance is required (e.g. if certain fields are tagged
> > by ASID/VMID), but I believe it's sufficient to use a (VM)ALL for the current
> > translation regime (which might be stronger than necessary).
> >
> > So for this case, my understanding is:
> >
> > 1) When changing TxSZ, we need a subsequent invalidate before the change is
> >    guaranteed to take effect. So I believe that what we do today isn't quite
> >    right.
> >
> > 2) During the window between writing to TxSZ and completing the invalidation,
> >    I'm not sure how the MMU is permitted to behave w.r.t. intermediate walk
> >    entries. I could imagine that those become (CONSTRAINED) UNPREDICTABLE, and
> >    that we might need to ensure those are invalidated (or inactive and prior
> >    walks completed) before we write to TCR_EL1.
> >
> 
> OK, so that would at least mean that we can modify T1SZ while TTBR1 is
> pointing to reserved_pg_dir without running the risk of TLB conflicts,
> right? (and similarly for TTBR0). 

That is my understanding, yes.
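
FWIW, a rough (untested) sketch of the sequence we're describing, using
the usual sysreg accessors (asm/sysreg.h) and the TCR_* field
definitions from pgtable-hwdef.h; update_t1sz() itself is a made-up
name:

	/*
	 * Update TCR_EL1.T1SZ while TTBR1 still points at
	 * reserved_pg_dir, then invalidate per R_VNRFW so that the
	 * new field value is guaranteed to take effect.
	 */
	static void update_t1sz(unsigned int vabits)
	{
		u64 tcr = read_sysreg(tcr_el1);

		tcr &= ~(0x3fUL << TCR_T1SZ_OFFSET);	/* clear T1SZ */
		tcr |= TCR_T1SZ(vabits);		/* T1SZ = 64 - vabits */
		write_sysreg(tcr, tcr_el1);
		isb();			/* complete the sysreg write ... */
		local_flush_tlb_all();	/* ... then TLBI VMALLE1 + DSB + ISB */
	}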

> Given that we never modify TxSZ on a hot path, sprinkling some bonus TLBIs
> (and pivoting via reserved_pg_dir if needed) is hardly an issue here.

Agreed.
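
For the TTBR0 side, an equally untested sketch built from the existing
helpers in asm/mmu_context.h, with the invalidation moved to after the
T0SZ write as Catalin suggested above:

	cpu_set_reserved_ttbr0();	/* pivot TTBR0 to reserved_pg_dir */
	cpu_set_default_tcr_t0sz();	/* program the new T0SZ */
	local_flush_tlb_all();		/* make the TxSZ change take effect */
	cpu_switch_mm(mm->pgd, mm);	/* reinstall the real tables + ASID */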

> > I can go chase this up with the architects; in the mean time my thinking would
> > be that we should retain the existing maintenance.
> 
> Yes, that would be helpful, thanks.

Cool; I'll try to do that shortly, and will keep you updated.

Thanks,
Mark.


