[PATCH v1] arm64: mm: Don't sleep in split_kernel_leaf_mapping() when in atomic context

Tue Nov 4 04:41:10 PST 2025

Hey Ryan,

On Mon, Nov 03, 2025 at 04:28:44PM +0000, Ryan Roberts wrote:
> On 03/11/2025 15:37, Will Deacon wrote:
> > On Mon, Nov 03, 2025 at 12:57:37PM +0000, Ryan Roberts wrote:
> >> +static int range_split_to_ptes(unsigned long start, unsigned long end, gfp_t gfp)
> >> +{
> >> +	int ret;
> >> +
> >> +	arch_enter_lazy_mmu_mode();
> >> +	ret = walk_kernel_page_table_range_lockless(start, end,
> >> +					&split_to_ptes_ops, NULL, &gfp);
> >> +	arch_leave_lazy_mmu_mode();
> >
> > Why are you entering/leaving lazy mode now? linear_map_split_to_ptes()
> > calls flush_tlb_kernel_range() right after this so now it looks like
> > we have more barriers than we need there.
> 
> Without the lazy mmu block, every write to every pte (or pmd/pud) will cause a
> dsb and isb to be emitted. With the lazy mmu block, we only emit a single
> dsb/isb at the end of the block.
> 
> linear_map_split_to_ptes() didn't previously have a lazy mmu block; that was an
> oversight, I believe. So when refactoring I thought it made sense to make it
> common for both cases.
> 
> Yes, the flush_tlb_kernel_range() also has the barriers, so the lazy mmu mode is
> reducing from a gazillion barriers to 2. We could further optimize from 2 to 1,
> but I doubt the performance improvement will be measurable.
> 
> Perhaps I've misunderstood your point...?

I was just trying to understand whether this was a functional thing (which
I couldn't grok) or an optimisation. Sounds like it's the latter, but I'd
prefer not to mix optimisations with fixes.
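
For illustration, the batching described above (one dsb/isb per page-table
write outside a lazy mmu block, a single deferred pair when the block ends)
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions: every name in it (model_set_pte(), model_enter_lazy(),
model_leave_lazy(), count_barriers()) is hypothetical, the "barrier" is just
a counter, and none of it is the arm64 implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of barrier batching. None of these names
 * are kernel APIs; a "barrier" here is only a counter increment.
 */
static bool lazy_mode;                 /* inside the batched region? */
static bool barrier_pending;           /* a write still owes a barrier */
static unsigned long barriers_emitted; /* number of modelled dsb+isb pairs */

/* Stands in for emitting one dsb + isb pair. */
static void emit_barriers(void)
{
	barriers_emitted++;
}

static void model_enter_lazy(void)
{
	lazy_mode = true;
}

/* Leaving the region pays for all deferred writes with a single pair. */
static void model_leave_lazy(void)
{
	if (barrier_pending) {
		emit_barriers();
		barrier_pending = false;
	}
	lazy_mode = false;
}

/* Write one entry; emit the barriers now or defer them to region exit. */
static void model_set_pte(unsigned long *ptep, unsigned long val)
{
	*ptep = val;
	if (lazy_mode)
		barrier_pending = true;
	else
		emit_barriers();
}

/* Write n entries and return how many barrier pairs the model emitted. */
static unsigned long count_barriers(unsigned long *table, size_t n, bool lazy)
{
	assert(table != NULL);

	barriers_emitted = 0;
	if (lazy)
		model_enter_lazy();
	for (size_t i = 0; i < n; i++)
		model_set_pte(&table[i], (unsigned long)i | 1UL);
	if (lazy)
		model_leave_lazy();
	return barriers_emitted;
}
```

In this model, writing n entries costs n barrier pairs without the lazy
region and one pair with it. A trailing TLB flush that carries its own
barriers adds one more on top, which is the "2" mentioned above.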

Will