[PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory

Wed May 20 05:29:57 PDT 2026

Hi Andrew,

I've reviewed all the Sashiko findings:

- Patch 2 (fls() truncation risk): Will fix by replacing fls() with
  __fls(), which takes an unsigned long directly.

- Patch 4 (nr overflow risk): Pre-existing type choice, not introduced
  by this series.

- Patch 4 (missing NULL check before page_to_phys): Will fix.
  Add defensive checks consistent with vmap_pages_pte_range().

- Patch 5 (flush_cache_vmap with empty range): Valid point. Will
  save the original start address and use it for the final flush.

- Patch 5 (virtual address alignment not checked): Addressed by
  Patch 6 in this series.

- Patch 6 (caller tracking loss and while(1) loop): Valid point.
  Will pass caller as a parameter and restructure per Uladzislau's
  suggestion to replace while(1) with explicit sequential attempts.

- Patch 7 (partial cache flush on early break): Same root cause as
  the Patch 5 flush issue.
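
For context on the Patch 2 item, here is a small userspace sketch of why
passing an unsigned long to fls() can truncate. The model_fls(),
model___fls() and truncated_fls() helpers are hypothetical stand-ins built
on GCC/Clang builtins, not the kernel implementations, and the sketch
assumes a 64-bit unsigned long. Note that fls() is 1-based and returns 0
for 0, while __fls() is 0-based and undefined for 0, so the swap likely
needs an index adjustment and a non-zero guard at the call site.

```c
#include <limits.h>

/*
 * Hypothetical userspace stand-ins for illustration only; they mimic the
 * documented semantics of the kernel helpers but are not the kernel code.
 */

/* Like fls(): takes unsigned int, 1-based index of the top set bit, 0 if none. */
static inline int model_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/* Like __fls(): takes unsigned long, 0-based index; caller must pass non-zero. */
static inline unsigned long model___fls(unsigned long word)
{
	return sizeof(word) * CHAR_BIT - 1 - (unsigned long)__builtin_clzl(word);
}

/* Passing an unsigned long through the int-sized helper drops the high bits. */
static inline int truncated_fls(unsigned long word)
{
	return model_fls((unsigned int)word);
}
```

With word = 1UL << 40 on a 64-bit build, truncated_fls() returns 0 because
the only set bit is above bit 31, while model___fls() returns 40.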

Will send v3 shortly.

Thanks,
Wen

On Wed, 20 May 2026 at 04:17, Andrew Morton <akpm at linux-foundation.org> wrote:
>
> On Thu, 14 May 2026 17:41:01 +0800 Wen Jiang <jiangwenxiaomi at gmail.com> wrote:
>
> > This patchset accelerates ioremap, vmalloc, and vmap when the memory
> > is physically fully or partially contiguous.
> >
> > ...
> >
> > On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
> > the performance CPUfreq policy enabled, benchmark results:
> >
> > * ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
> > * vmalloc(1 MB) mapping time (excluding allocation) with
> >   VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53 us)
> > * vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
>
> Nice.
>
> AI review found a bunch of things to ask about:
>         https://sashiko.dev/#/patchset/20260514094108.2016201-1-jiangwen6@xiaomi.com
>
> It doesn't appear that you'll be getting any more review on this
> series, so please check the above questions and resend?
>