[PATCH] arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults

Piotr Jaroszynski pjaroszynski at nvidia.com
Wed Mar 4 09:16:58 PST 2026


On Wed, Mar 04, 2026 at 11:39:49AM -0400, Jason Gunthorpe wrote:
> On Wed, Mar 04, 2026 at 03:01:51PM +0000, Catalin Marinas wrote:
> > Good point. For the AF bit, the hardware is not allowed to cache it in
> > the TLB, so we can't get an AF fault for an unrelated VA nearby.
> 
> The way we have read the spec is there is no restriction on what PTE
> the HW accesses when it encounters a CONT group.
> 
> To be concrete, the spec seems to say it is legal to make HW that
> fetches the PTE at the VA, sees the CONT bit, and then always fetches
> the 0th PTE from the group and only uses that for permission checks.
> 
> Therefore SW should never assume that HW will read any particular
> sub-PTE under any scenario.
> 
> It seems current cores don't do this, and it is a bit silly to do, but
> I can imagine an optimization where the core does a cache line fetch to
> read the PTE so it can freely snap to the PTE at the start of the
> cache line for permission checks. Consolidating permission storage to
> fewer PTEs would reduce atomic memory traffic if the TLB is thrashing.

"The Contiguous bit" section I quoted in the change says this:
 The entry is permitted to be cached in a TLB as though it is one of a
 number of adjacent translation table entries that point to a contiguous
 OA range with consistent attributes and permissions.

 Software is required to ensure that all of the adjacent translation
 table entries for the contiguous region point to a contiguous OA range
 with consistent attributes and permissions.

I think your example is valid, as any of the sub-PTEs can be cached.
Another valid example: you first access address A, and the PTE for A
gets cached as the value for the whole 2MB region. Then you access a
different address B within the region and fault based on the cached
attributes. In this case the SMMU never had to read the PTE for B, as
it already cached the translation when accessing A. If the fault
handling code only read the PTE for B, it could find that e.g. RDONLY
was already cleared there, treat the update as a no-op, and hit the
problem again.
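
To make the A/B scenario concrete, here is a minimal user-space sketch.
It is only an illustration under simplifying assumptions, not the code
from the patch: pte_model_t, the noop_check_*() helpers, fix_group(),
the group size of 16 and the bit positions are all hypothetical
stand-ins chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified model, not the kernel's pte_t or helpers. */
typedef uint64_t pte_model_t;

#define CONT_PTES   16u            /* entries per contiguous group (assumed) */
#define PTE_AF      (1ull << 10)   /* access flag (illustrative position) */
#define PTE_RDONLY  (1ull << 7)    /* read-only bit (illustrative position) */
#define ACCESS_MASK (PTE_AF | PTE_RDONLY)

/* Looks only at the sub-PTE for the faulting address (index idx). */
static bool noop_check_single(const pte_model_t *group, size_t idx,
                              pte_model_t entry)
{
    assert(idx < CONT_PTES);
    return (group[idx] & ACCESS_MASK) == (entry & ACCESS_MASK);
}

/* No-op only if every sub-PTE in the group already matches entry. */
static bool noop_check_group(const pte_model_t *group, pte_model_t entry)
{
    for (size_t i = 0; i < CONT_PTES; i++) {
        if ((group[i] & ACCESS_MASK) != (entry & ACCESS_MASK))
            return false;
    }
    return true;
}

/* Applies the access bits of entry to every sub-PTE in the group. */
static void fix_group(pte_model_t *group, pte_model_t entry)
{
    for (size_t i = 0; i < CONT_PTES; i++)
        group[i] = (group[i] & ~ACCESS_MASK) | (entry & ACCESS_MASK);
}
```

With a group where only the sub-PTE for B has RDONLY cleared, the
single-entry check reports a no-op while the group-wide check does not,
which is the case where the cached attributes from A keep faulting.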

In summary, I don't see a way to skip reading and fixing all the
sub-PTEs. And the previous code is already reading all of them so the
fix is not adding any new overhead.

> 
> Jason




