[PATCH] [RFC] arm64: mmu: use range based TLB flushing when hot unplugging memory
Balbir Singh
balbirs at nvidia.com
Thu May 21 03:46:02 PDT 2026
On Thu, May 21, 2026 at 09:50:04AM +0100, Ryan Roberts wrote:
> On 21/05/2026 05:24, Alistair Popple wrote:
> > Hot unplugging memory on ARM64 requires a TLB invalidate after unmapping
> > the page to be hot unplugged from the direct map. Currently that happens
> > one page at a time, meaning range based invalidates cannot be used. The
> > result of this is that removing large amounts of memory takes a long
> > time and in some cases can trigger an RCU stall warning.
> >
> > For example on one system hot unplugging 480GB of memory takes ~1
> > minute. With this change the same operation took ~1 second, a 60x
> > improvement.
> >
> > Signed-off-by: Alistair Popple <apopple at nvidia.com>
> >
> > ---
> >
> > This is an RFC, because I'm not sure the change is correct as it frees
> > the PTE page before flushing the TLB. I'm not familiar enough with ARM64
> > architecture to be sure this is safe, for example I don't know if HW
> > can update PTE bits such as access/dirty in the page through a stale
> > TLB entry.
> >
> > If so this would open a window during which the page is free but could
> > still be written to. Likely the safe option would be to collect all the
> > pages to be freed on a list and free them after doing the range based TLB
> > flush, but I wanted to get feedback on the approach before implementing it,
> > which is the goal of this RFC.
>
> Hi Alistair,
>
> This patch doesn't apply on v7.1-rc4 because it conflicts with this patch:
>
> Commit 48478b9f79137 ("arm64/mm: Enable batched TLB flush in unmap_hotplug_range()")
>
> which has a very similar performance improvement, so hopefully it solves your
> problem?
>
> There are two paths which use this logic; unmapping the linear map and unmapping
> the corresponding vmemmap. In the latter case, the memory is also freed, so we
> can't safely do the range optimization there since the TLB needs to be flushed
> before freeing the memory. But the linear map is the big, slow bit so hopefully
> it's sufficient for you?
>
I assume the vmemmap path tears down the struct pages corresponding to
the physical memory being removed, so it should be OK for vmemmap
teardown to keep flushing before it frees. It is worth checking whether
the issue is already fixed by that commit.
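
For reference, the deferred-free idea from the RFC note (collect the
pages on a list, do one range flush, then free them) could be sketched
roughly as below. This is a userspace illustration only, not kernel
code: struct pt_page, struct deferred_free, flush_range() and the other
names are hypothetical stand-ins, and flush_range() merely counts calls
instead of issuing a TLB invalidate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page-table page; not a kernel type. */
struct pt_page {
	struct pt_page *next;
};

/* Hypothetical batch of pages that must not be freed before the flush. */
struct deferred_free {
	struct pt_page *head;
	size_t pending;	/* pages queued, not yet freed */
	size_t flushes;	/* range flushes issued so far */
	size_t freed;	/* pages freed so far */
};

/* Queue a page instead of freeing it immediately. */
static void defer_page(struct deferred_free *b, struct pt_page *p)
{
	p->next = b->head;
	b->head = p;
	b->pending++;
}

/* Stand-in for one range-based TLB invalidate; here it only counts. */
static void flush_range(struct deferred_free *b)
{
	b->flushes++;
}

/* Flush once for the whole batch, and only then free the queued pages. */
static void flush_and_free(struct deferred_free *b)
{
	if (!b->head)
		return;

	flush_range(b);

	while (b->head) {
		struct pt_page *p = b->head;

		b->head = p->next;
		free(p);
		b->pending--;
		b->freed++;
	}
}

/* Simulate unmapping n pages: defer each one, then flush and free once. */
static struct deferred_free unmap_range_sketch(size_t n)
{
	struct deferred_free b = { 0 };

	for (size_t i = 0; i < n; i++) {
		struct pt_page *p = malloc(sizeof(*p));

		if (!p)
			break;
		defer_page(&b, p);
	}

	flush_and_free(&b);
	assert(b.pending == 0);
	return b;
}
```

The point of the ordering is that no page goes back to the allocator
while a stale TLB entry could still reference it, while the flush cost
is paid once per range rather than once per page.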
Balbir
<snip>