[PATCH] arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults

Piotr Jaroszynski pjaroszynski at nvidia.com
Tue Mar 3 13:40:34 PST 2026


On Mon, Mar 02, 2026 at 11:19:46PM -0800, James Houghton wrote:
> On Mon, Mar 2, 2026 at 10:38 PM Piotr Jaroszynski
> <pjaroszynski at nvidia.com> wrote:
> >
> > contpte_ptep_set_access_flags() compared the gathered ptep_get() value
> > against the requested entry to detect no-ops. ptep_get() ORs AF/dirty
> > from all sub-PTEs in the CONT block, so a dirty sibling can make the
> > target appear already-dirty. When the gathered value matches entry, the
> > function returns 0 even though the target sub-PTE still has PTE_RDONLY
> > set in hardware.
> >
> > For CPU page-table walks this is benign: with FEAT_HAFDBS the hardware
> > may set AF/dirty on any sub-PTE and the CPU TLB treats the gathered
> > result as authoritative for the entire range. But an SMMU without HTTU
> > (or with HA/HD disabled in CD.TCR) evaluates each descriptor
> > individually and will keep raising F_PERMISSION on the unchanged target
> > sub-PTE, causing an infinite fault loop.
> >
> > Gathering can therefore cause false no-ops when only a sibling has been
> > updated:
> >  - write faults: target still has PTE_RDONLY (needs PTE_RDONLY cleared)
> >  - read faults:  target still lacks PTE_AF
> >
> > Fix by checking all sub-PTEs' access flags individually (not via the
> > gathered view) before returning no-op, and use the raw target PTE for
> > the write-bit unfold decision. The access-flag mask matches the one
> > used by __ptep_set_access_flags().
> >
> > Per Arm ARM (DDI 0487) D8.7.1 ("The Contiguous bit"), any sub-PTE in a CONT
> > range may become the effective cached translation and software must
> > maintain consistent attributes across the range.
> >
> > Fixes: 4602e5757bcc ("arm64/mm: wire up PTE_CONT for user mappings")
> >
> > Reviewed-by: Alistair Popple <apopple at nvidia.com>
> > Cc: Ryan Roberts <ryan.roberts at arm.com>
> > Cc: Catalin Marinas <catalin.marinas at arm.com>
> > Cc: Will Deacon <will at kernel.org>
> > Cc: Jason Gunthorpe <jgg at nvidia.com>
> > Cc: John Hubbard <jhubbard at nvidia.com>
> > Cc: Zi Yan <ziy at nvidia.com>
> > Cc: Breno Leitao <leitao at debian.org>
> > Cc: stable at vger.kernel.org
> > Signed-off-by: Piotr Jaroszynski <pjaroszynski at nvidia.com>
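
To make the failure mode concrete, here is a stand-alone sketch of the
two no-op checks. The bit positions, CONT_PTES value and helper names
below are illustrative only — they are not the real arm64 encodings or
the exact patch code:

```c
#include <stdint.h>

#define PTE_AF     (1ULL << 10)
#define PTE_RDONLY (1ULL << 7)
#define PTE_DIRTY  (1ULL << 55)
#define CONT_PTES  16

/* Gathered view, as contpte_ptep_get() builds it: AF/dirty are OR'ed
 * in from every sub-PTE, and marking the view dirty also clears
 * RDONLY (the pte_mkdirty() effect). */
static uint64_t gather(const uint64_t *ptes, int i)
{
	uint64_t v = ptes[i];

	for (int j = 0; j < CONT_PTES; j++) {
		if (ptes[j] & PTE_AF)
			v |= PTE_AF;
		if (ptes[j] & PTE_DIRTY) {
			v |= PTE_DIRTY;
			v &= ~PTE_RDONLY;
		}
	}
	return v;
}

/* Broken check: a dirty sibling makes the gathered view equal the
 * requested entry, so a still-read-only target looks like a no-op. */
static int noop_gathered(const uint64_t *ptes, int i, uint64_t entry)
{
	return gather(ptes, i) == entry;
}

/* Fixed check (in spirit): every sub-PTE must already carry the
 * requested access flags before we may report a no-op. */
static int noop_per_pte(const uint64_t *ptes, int i, uint64_t entry)
{
	uint64_t mask = PTE_AF | PTE_RDONLY | PTE_DIRTY;

	for (int j = 0; j < CONT_PTES; j++)
		if ((ptes[j] & mask) != (entry & mask))
			return 0;
	return 1;
}
```

With a read-only target sub-PTE and one dirty sibling, the gathered
check reports a no-op while the per-sub-PTE check correctly does not —
which is exactly the window the SMMU fault loop falls into.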
> 
> Thanks for the fix!
> 
> This is similar (sort of) to a HugeTLB page fault loop I stumbled upon
> a while ago[1]. (I wonder if there have been more cases like this.)

I see that your commit 3c0696076aad ("arm64: mm: Always make sw-dirty
PTEs hw-dirty in pte_modify") from that discussion was picked up, and
it's directly relevant to the hugetlb exposure question. With your
patch applied, do we have a guarantee that sw-dirty implies hw-dirty in
all cases? If so, that path should have no exposure, but it would still
be worth stating the guarantee explicitly.
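
For reference, the sw-dirty -> hw-dirty propagation that commit does
can be modeled roughly like this (again with made-up bit values and
simplified helpers, treating "hw-dirty" as DBM set with RDONLY clear):

```c
#include <stdint.h>

#define PTE_RDONLY (1ULL << 7)
#define PTE_DBM    (1ULL << 51)
#define PTE_DIRTY  (1ULL << 55)		/* software dirty */
/* bits this toy pte_modify() replaces with the new protections */
#define PROT_MASK  (PTE_RDONLY | PTE_DBM)

static int pte_hw_dirty(uint64_t pte)
{
	return (pte & PTE_DBM) && !(pte & PTE_RDONLY);
}

static int pte_sw_dirty(uint64_t pte)
{
	return !!(pte & PTE_DIRTY);
}

static uint64_t pte_mkdirty(uint64_t pte)
{
	pte |= PTE_DIRTY;
	if (pte & PTE_DBM)
		pte &= ~PTE_RDONLY;
	return pte;
}

/* New protections encode "writable but clean" as RDONLY+DBM; after
 * the fix, a sw-dirty pte is re-marked dirty so the hw view keeps
 * RDONLY clear and sw-dirty implies hw-dirty for DBM-capable ptes. */
static uint64_t pte_modify_sketch(uint64_t pte, uint64_t newprot)
{
	pte = (pte & ~PROT_MASK) | newprot;
	if (pte_sw_dirty(pte))
		pte = pte_mkdirty(pte);
	return pte;
}
```

Without the pte_mkdirty() step, applying a clean writable newprot would
leave the pte sw-dirty but hw-clean, which is the mismatch in question.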

> 
> Feel free to add:
> 
> Reviewed-by: James Houghton <jthoughton at google.com>

Thanks!

> 
> [1] https://lore.kernel.org/all/20231204172646.2541916-1-jthoughton@google.com
