[PATCH] arm64: contpte: fix set_access_flags() no-op check for SMMU/ATS faults
Piotr Jaroszynski
pjaroszynski at nvidia.com
Wed Mar 4 09:16:58 PST 2026
On Wed, Mar 04, 2026 at 11:39:49AM -0400, Jason Gunthorpe wrote:
> On Wed, Mar 04, 2026 at 03:01:51PM +0000, Catalin Marinas wrote:
> > Good point. For the AF bit, the hardware is not allowed to cache it in
> > the TLB, so we can't get an AF fault for an unrelated VA nearby.
>
> The way we have read the spec is there is no restriction on what PTE
> the HW accesses when it encounters a CONT group.
>
> To be concrete, the spec seems to say it is legal to make HW that
> fetches the PTE at the VA, sees the CONT bit, and then always fetches
> the 0th PTE from the group and only uses that for permission checks.
>
> Therefore SW should never assume that HW will read any particular
> sub-PTE under any scenario.
>
> It seems current cores don't do this, and it is a bit silly to do, but
> I can imagine an optimization where the core does a cache line fetch to
> read the PTE so it can freely snap to the PTE at the start of the
> cache line for permission checks. Consolidating permission storage to
> fewer PTEs would reduce atomic memory traffic if the TLB is thrashing.

"The Contiguous bit" section I quoted in the change says this:

  The entry is permitted to be cached in a TLB as though it is one of a
  number of adjacent translation table entries that point to a contiguous
  OA range with consistent attributes and permissions.

  Software is required to ensure that all of the adjacent translation
  table entries for the contiguous region point to a contiguous OA range
  with consistent attributes and permissions.

I think your example is valid, as any of the sub-PTEs can be cached.

Another valid example: you first access address A, and the PTE for A gets
cached as the value for the whole 2MB region. Then you access a different
address B within the region and fault based on the cached attributes. In
this case the SMMU never had to read the PTE for B, because the entry it
cached when accessing A already covers B. If the fault handling code only
read the PTE for B, it could see that e.g. RDONLY was already cleared
there, treat the fault as a no-op, and hit the problem again.

In summary, I don't see a way to skip reading and fixing all the
sub-PTEs. And the previous code is already reading all of them so the
fix is not adding any new overhead.
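
To make the A/B example concrete, here is a minimal userspace sketch, not
the kernel code. The helper names, the 16-entry block size and the bit
positions are assumptions chosen for illustration (they mirror a 4K
granule and arm64-like AF/RDONLY bits). It contrasts a no-op check that
looks only at the faulting sub-PTE with one that looks at all of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only: 16 PTEs per contiguous block (as with a 4K
 * granule) and arm64-like bit positions for AF and RDONLY. */
#define CONT_PTES   16
#define PTE_AF      (UINT64_C(1) << 10)
#define PTE_RDONLY  (UINT64_C(1) << 7)
#define ACCESS_MASK (PTE_AF | PTE_RDONLY)

/* Hypothetical helper: does this one PTE already carry the requested flags? */
static bool pte_flags_match(uint64_t pte, uint64_t entry)
{
	return (pte & ACCESS_MASK) == (entry & ACCESS_MASK);
}

/* Insufficient check: looks only at the sub-PTE for the faulting address. */
static bool noop_check_single(const uint64_t *ptes, size_t idx, uint64_t entry)
{
	assert(idx < CONT_PTES);
	return pte_flags_match(ptes[idx], entry);
}

/* Check every sub-PTE: a no-op only if all of them already match. */
static bool noop_check_all(const uint64_t *ptes, uint64_t entry)
{
	for (size_t i = 0; i < CONT_PTES; i++)
		if (!pte_flags_match(ptes[i], entry))
			return false;
	return true;
}

/* Bring every sub-PTE in the block to the requested access flags. */
static void fix_all(uint64_t *ptes, uint64_t entry)
{
	for (size_t i = 0; i < CONT_PTES; i++)
		ptes[i] = (ptes[i] & ~ACCESS_MASK) | (entry & ACCESS_MASK);
}
```

If every sub-PTE starts as AF | RDONLY, only the one for B has RDONLY
cleared, and the requested entry is AF without RDONLY, then
noop_check_single() on B reports a no-op while noop_check_all() does not.
Only after fix_all() do both agree.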
>
> Jason