[PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
Wen Jiang
jiangwenxiaomi at gmail.com
Wed May 20 05:29:57 PDT 2026
Hi Andrew,

I've reviewed all the Sashiko findings:

- Patch 2 (fls() truncation risk): Will fix. Replace fls() with
  __fls() to accept unsigned long directly.
- Patch 4 (nr overflow risk): Pre-existing type choice.
- Patch 4 (missing NULL check before page_to_phys): Will fix.
  Add defensive checks consistent with vmap_pages_pte_range().
- Patch 5 (flush_cache_vmap with empty range): Valid point. Will
  save the original start address and use it for the final flush.
- Patch 5 (virtual address alignment not checked): Addressed by
  Patch 6 in this series.
- Patch 6 (caller tracking loss and while(1) loop): Valid point.
  Will pass caller as a parameter and restructure per Uladzislau's
  suggestion to replace while(1) with explicit sequential attempts.
- Patch 7 (partial cache flush on early break): Same root cause as
  the Patch 5 flush issue.
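
To illustrate the Patch 2 item, here is a minimal userspace sketch, not
the code from the series. sketch_fls() and sketch_fls_ulong() are
hypothetical stand-ins that mimic the semantics of the kernel's fls()
(unsigned int argument, 1-based result, 0 for a zero input) and __fls()
(unsigned long argument, 0-based result, undefined for zero). They
assume the GCC/Clang builtins __builtin_clz() and __builtin_clzl().

```c
#include <limits.h>

/*
 * Hypothetical stand-in for the kernel's fls(): the argument is an
 * unsigned int, so a wider value is truncated before the bit scan.
 * Returns the 1-based index of the highest set bit, or 0 if x == 0.
 */
static int sketch_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/*
 * Hypothetical stand-in for the kernel's __fls(): the argument is an
 * unsigned long, so no truncation occurs on 64-bit builds.
 * Returns the 0-based index of the highest set bit.
 * The result is undefined for word == 0, as with __fls().
 */
static unsigned long sketch_fls_ulong(unsigned long word)
{
	return (sizeof(word) * CHAR_BIT) - 1 -
	       (unsigned long)__builtin_clzl(word);
}
```

On a 64-bit build, a value such as 1UL << 32 is truncated to 0 when
passed to the unsigned int variant, which then reports no bit set,
while the unsigned long variant reports bit 32. The two helpers also
differ by one in their result convention, so a caller that relied on
the 1-based value needs a matching adjustment.
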

Will resend v3 shortly.

Thanks,
Wen

On Wed, 20 May 2026 at 04:17, Andrew Morton <akpm at linux-foundation.org> wrote:
>
> On Thu, 14 May 2026 17:41:01 +0800 Wen Jiang <jiangwenxiaomi at gmail.com> wrote:
>
> > This patchset accelerates ioremap, vmalloc, and vmap when the memory
> > is physically fully or partially contiguous.
> >
> > ...
> >
> > On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
> > the performance CPUfreq policy enabled, benchmark results:
> >
> > * ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
> > * vmalloc(1 MB) mapping time (excluding allocation) with
> > VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53 us)
> > * vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
>
> Nice.
>
> AI review found a bunch of things to ask about:
> https://sashiko.dev/#/patchset/20260514094108.2016201-1-jiangwen6@xiaomi.com
>
> It doesn't appear that you'll be getting any more review on this
> series, so please check the above questions and resend?
>