Patch "arm64: mm: Batch dsb and isb when populating pgtables" has been added to the 6.6-stable tree
gregkh at linuxfoundation.org
Thu Mar 19 04:12:51 PDT 2026
This is a note to let you know that I've just added the patch titled
arm64: mm: Batch dsb and isb when populating pgtables
to the 6.6-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
arm64-mm-batch-dsb-and-isb-when-populating-pgtables.patch
and it can be found in the queue-6.6 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable at vger.kernel.org> know about it.
From stable+bounces-216826-greg=kroah.com at vger.kernel.org Tue Feb 17 14:35:05 2026
From: Ryan Roberts <ryan.roberts at arm.com>
Date: Tue, 17 Feb 2026 13:34:07 +0000
Subject: arm64: mm: Batch dsb and isb when populating pgtables
To: stable at vger.kernel.org
Cc: Ryan Roberts <ryan.roberts at arm.com>, catalin.marinas at arm.com, will at kernel.org, linux-arm-kernel at lists.infradead.org, linux-kernel at vger.kernel.org, Jack Aboutboul <jaboutboul at microsoft.com>, Sharath George John <sgeorgejohn at microsoft.com>, Noah Meyerhans <nmeyerhans at microsoft.com>, Jim Perrin <Jim.Perrin at microsoft.com>, Itaru Kitayama <itaru.kitayama at fujitsu.com>, Eric Chanudet <echanude at redhat.com>, Mark Rutland <mark.rutland at arm.com>, Ard Biesheuvel <ardb at kernel.org>
Message-ID: <20260217133411.2881311-3-ryan.roberts at arm.com>
From: Ryan Roberts <ryan.roberts at arm.com>
[ Upstream commit 1fcb7cea8a5f7747e02230f816c2c80b060d9517 ]
After removing unnecessary TLBIs, the next bottleneck when creating the
page tables for the linear map is DSB and ISB, which were previously
issued per-pte in __set_pte(). Since we are writing multiple ptes in a
given pte table, we can elide these barriers and insert them once we
have finished writing to the table.
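
The batching idea can be modelled in user space. The sketch below is not
kernel code: model_barrier() only counts calls in place of a real DSB and
ISB, and every model_* and fill_* name and the 512-entry table size are
assumptions made for illustration. It also ignores the check in set_pte()
that only issues barriers for valid kernel ptes. In the patch itself no
explicit barrier is added; the final synchronisation comes from the
barriers already needed to clear the fixmap slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_TABLE 512	/* assumption: 4K pages, 512 entries per table */

typedef uint64_t model_pte_t;	/* hypothetical stand-in for pte_t */

static unsigned long barrier_count;	/* number of modelled DSB+ISB pairs */

/* Stand-in for "dsb; isb": it only counts how often it is called. */
static void model_barrier(void)
{
	barrier_count++;
}

/* Plain store with no barrier, modelled on set_pte_nosync(). */
static void model_set_pte_nosync(model_pte_t *ptep, model_pte_t pte)
{
	*(volatile model_pte_t *)ptep = pte;
}

/* Store followed by a barrier, modelling the old per-pte cost. */
static void model_set_pte(model_pte_t *ptep, model_pte_t pte)
{
	model_set_pte_nosync(ptep, pte);
	model_barrier();
}

/* Old scheme: one barrier per entry. Returns the barrier count. */
static unsigned long fill_per_entry(model_pte_t *table, size_t n,
				    model_pte_t base)
{
	assert(n <= PTRS_PER_TABLE);
	barrier_count = 0;
	for (size_t i = 0; i < n; i++)
		model_set_pte(&table[i], base + i);
	return barrier_count;
}

/* New scheme: write every entry, then synchronise once at the end. */
static unsigned long fill_batched(model_pte_t *table, size_t n,
				  model_pte_t base)
{
	assert(n <= PTRS_PER_TABLE);
	barrier_count = 0;
	for (size_t i = 0; i < n; i++)
		model_set_pte_nosync(&table[i], base + i);
	model_barrier();
	return barrier_count;
}
```

Filling a full table costs 512 modelled barriers with fill_per_entry() and
one with fill_batched(), while the table contents end up identical.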
Execution time of map_mem(), which creates the kernel linear map page
tables, was measured on different machines with different RAM configs:
               | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
               | VM, 16G     | VM, 64G     | VM, 256G    | Metal, 512G
---------------|-------------|-------------|-------------|-------------
               |   ms    (%) |   ms    (%) |   ms    (%) |   ms    (%)
---------------|-------------|-------------|-------------|-------------
before         |   78   (0%) |  435   (0%) | 1723   (0%) | 3779   (0%)
after          |   11 (-86%) |  161 (-63%) |  656 (-62%) | 1654 (-56%)
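
The percentages in the table are the change in map_mem() time relative to
the "before" row. A small sketch of that arithmetic follows; pct_change()
is a hypothetical helper, and rounding to the nearest whole percent is an
assumption that happens to reproduce the figures shown.

```c
/* Relative change from 'before' to 'after', rounded to the nearest percent. */
static int pct_change(long before, long after)
{
	double p = 100.0 * (double)(after - before) / (double)before;

	/* Round half away from zero without pulling in libm. */
	return (int)(p < 0 ? p - 0.5 : p + 0.5);
}
```

For the Apple M2 VM column, pct_change(78, 11) gives -86, matching the
table.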
Signed-off-by: Ryan Roberts <ryan.roberts at arm.com>
Tested-by: Itaru Kitayama <itaru.kitayama at fujitsu.com>
Tested-by: Eric Chanudet <echanude at redhat.com>
Reviewed-by: Mark Rutland <mark.rutland at arm.com>
Reviewed-by: Ard Biesheuvel <ardb at kernel.org>
Link: https://lore.kernel.org/r/20240412131908.433043-3-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will at kernel.org>
[ Ryan: Trivial backport ]
Signed-off-by: Ryan Roberts <ryan.roberts at arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
---
arch/arm64/include/asm/pgtable.h | 7 ++++++-
arch/arm64/mm/mmu.c | 11 ++++++++++-
2 files changed, 16 insertions(+), 2 deletions(-)
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -262,9 +262,14 @@ static inline pte_t pte_mkdevmap(pte_t p
 	return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
 }
 
-static inline void set_pte(pte_t *ptep, pte_t pte)
+static inline void set_pte_nosync(pte_t *ptep, pte_t pte)
 {
 	WRITE_ONCE(*ptep, pte);
+}
+
+static inline void set_pte(pte_t *ptep, pte_t pte)
+{
+	set_pte_nosync(ptep, pte);
 
 	/*
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -175,7 +175,11 @@ static void init_pte(pte_t *ptep, unsign
 	do {
 		pte_t old_pte = READ_ONCE(*ptep);
 
-		set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));
+		/*
+		 * Required barriers to make this visible to the table walker
+		 * are deferred to the end of alloc_init_cont_pte().
+		 */
+		set_pte_nosync(ptep, pfn_pte(__phys_to_pfn(phys), prot));
 
 		/*
 		 * After the PTE entry has been populated once, we
@@ -229,6 +233,11 @@ static void alloc_init_cont_pte(pmd_t *p
 		phys += next - addr;
 	} while (addr = next, addr != end);
 
+	/*
+	 * Note: barriers and maintenance necessary to clear the fixmap slot
+	 * ensure that all previous pgtable writes are visible to the table
+	 * walker.
+	 */
 	pte_clear_fixmap();
 }
 
Patches currently in stable-queue which might be from ryan.roberts at arm.com are
queue-6.6/arm64-mm-don-t-remap-pgtables-per-cont-pte-pmd-block.patch
queue-6.6/arm64-mm-don-t-remap-pgtables-for-allocate-vs-populate.patch
queue-6.6/arm64-mm-batch-dsb-and-isb-when-populating-pgtables.patch