[PATCH v3 5/6] mm/vmalloc: map contiguous pages in batches for vmap() if possible

Wen Jiang jiangwenxiaomi at gmail.com
Wed May 27 20:42:06 PDT 2026


On Wed, 27 May 2026 at 16:28, Dev Jain <dev.jain at arm.com> wrote:
>
>
>
> On 22/05/26 11:01 am, Wen Jiang wrote:
> > From: "Barry Song (Xiaomi)" <baohua at kernel.org>
> >
> > In many cases, the pages passed to vmap() may include high-order
> > pages. For example, the system heap often allocates pages in descending
> > order: order 8, then 4, then 0. Currently, vmap() iterates over every
> > page individually—even pages inside a high-order block are handled
> > one by one.
> >
> > This patch detects physically contiguous pages (regardless of whether
> > they are compound or non-compound) by scanning with
> > num_pages_contiguous(), and maps them as a single contiguous block
> > whenever possible. The first page's pfn must be aligned to the
> > mapping order for the batched mapping to be used.
> >
> > Pages with the same page_shift are coalesced and mapped via
> > vmap_pages_range_noflush_walk() to avoid page table rewalk.
> >
> > As users typically allocate memory in descending orders (e.g.
> > 8 → 4 → 0), once an order-0 page is encountered, we stop scanning
> > for contiguous pages since subsequent pages are likely order-0 as well.
> >
> > Signed-off-by: Barry Song (Xiaomi) <baohua at kernel.org>
> > Co-developed-by: Dev Jain <dev.jain at arm.com>
> > Signed-off-by: Dev Jain <dev.jain at arm.com>
> > Signed-off-by: Wen Jiang <jiangwen6 at xiaomi.com>
> > Tested-by: Xueyuan Chen <xueyuan.chen21 at gmail.com>
> > ---
> >  mm/vmalloc.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 80 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index deb764abc0571..50642246f4d40 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3542,6 +3542,84 @@ void vunmap(const void *addr)
> >  }
> >  EXPORT_SYMBOL(vunmap);
> >
> > +static inline int get_vmap_batch_order(struct page **pages,
> > +             unsigned int max_steps, unsigned int idx)
> > +{
> > +     unsigned int nr_contig;
> > +     int order;
> > +
> > +     if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP) ||
> > +                     ioremap_max_page_shift == PAGE_SHIFT)
>
>
> Why bail out on ioremap_max_page_shift == PAGE_SHIFT? The code
> path for ioremap is different from the one for vmap, right?
>
>

ioremap_max_page_shift is defined under CONFIG_HAVE_ARCH_HUGE_VMAP,
which controls huge mappings for both ioremap and vmap.

> > +             return 0;
> > +
> > +     nr_contig = num_pages_contiguous(&pages[idx], max_steps);
> > +     if (nr_contig < 2)
> > +             return 0;
> > +
> > +     order = fls(nr_contig) - 1;
> > +
> > +     if (arch_vmap_pte_supported_shift(PAGE_SIZE << order) == PAGE_SHIFT)
> > +             return 0;
> > +
> > +     /* Ensure the first page's pfn is aligned to the order */
> > +     if (!IS_ALIGNED(page_to_pfn(pages[idx]), 1 << order))
> > +             return 0;
> > +
> > +     return order;
> > +}
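
For readers following the order selection above, here is a minimal
userspace sketch of the same arithmetic. It is not the kernel code: it
works on a plain array of pfns, and sketch_fls(), sketch_nr_contig() and
sketch_batch_order() are hypothetical stand-ins for fls(),
num_pages_contiguous() and get_vmap_batch_order(). The
CONFIG_HAVE_ARCH_HUGE_VMAP and arch_vmap_pte_supported_shift() checks are
deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for fls(): 1-based index of the highest set bit, 0 for x == 0. */
static int sketch_fls(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Stand-in for num_pages_contiguous(): length of the leading run of
 * consecutive pfns, looking at no more than max_steps entries. */
static unsigned int sketch_nr_contig(const unsigned long *pfns,
				     unsigned int max_steps)
{
	unsigned int n;

	assert(max_steps > 0);
	for (n = 1; n < max_steps; n++)
		if (pfns[n] != pfns[0] + n)
			break;
	return n;
}

/* Largest power-of-two order that fits in the contiguous run, or 0 when
 * the run is too short or the first pfn is not aligned to that order. */
static int sketch_batch_order(const unsigned long *pfns,
			      unsigned int max_steps)
{
	unsigned int nr = sketch_nr_contig(pfns, max_steps);
	int order;

	if (nr < 2)
		return 0;

	order = sketch_fls(nr) - 1;	/* floor(log2(nr)) */

	if (pfns[0] & ((1UL << order) - 1))
		return 0;

	return order;
}
```

With a run of six contiguous pfns starting at pfn 16, this yields order 2
(four pages); the same run starting at pfn 17 yields 0 because the first
pfn is not aligned to the order.
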
> > +
> > +static int vmap_batched(unsigned long addr, unsigned long end,
> > +             pgprot_t prot, struct page **pages)
> > +{
> > +     unsigned int count = (end - addr) >> PAGE_SHIFT;
> > +     unsigned int prev_shift = 0, idx = 0;
> > +     unsigned long start = addr, map_addr = addr;
> > +     int err;
> > +
> > +     err = kmsan_vmap_pages_range_noflush(addr, end, prot, pages,
> > +                                             PAGE_SHIFT, GFP_KERNEL);
> > +     if (err)
> > +             goto out;
> > +
> > +     for (unsigned int i = 0; i < count; ) {
> > +             unsigned int shift = PAGE_SHIFT +
> > +                     get_vmap_batch_order(pages, count - i, i);
> > +
> > +             if (!i)
> > +                     prev_shift = shift;
> > +
> > +             if (shift != prev_shift) {
> > +                     err = vmap_pages_range_noflush_walk(map_addr, addr,
>
> It would be worth documenting that vmap_pages_range_noflush_walk()
> can take an array of pages which are not all contiguous, but which
> may contain contiguous chunks, as hinted by page_shift.
>
> Otherwise this looks good.
>
> > +                                     prot, pages + idx,
> > +                                     min(prev_shift, PMD_SHIFT));
> > +                     if (err)
> > +                             goto out;
> > +                     prev_shift = shift;
> > +                     map_addr = addr;
> > +                     idx = i;
> > +             }
> > +
> > +             /*
> > +              * Once small pages are encountered, the remaining pages
> > +              * are likely small as well.
> > +              */
> > +             if (shift == PAGE_SHIFT)
> > +                     break;
> > +
> > +             addr += 1UL << shift;
> > +             i += 1U << (shift - PAGE_SHIFT);
> > +     }
> > +
> > +     /* Remaining */
> > +     if (map_addr < end)
> > +             err = vmap_pages_range_noflush_walk(map_addr, end,
> > +                             prot, pages + idx, min(prev_shift, PMD_SHIFT));
> > +
> > +out:
> > +     flush_cache_vmap(start, end);
> > +     return err;
> > +}
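
To make the coalescing in vmap_batched() easier to follow, here is a
rough userspace model of the loop. It only records which index ranges
would be handed to vmap_pages_range_noflush_walk() and with which shift;
it maps nothing. struct sk_range, sk_order() and sk_coalesce() are
hypothetical names, SK_PAGE_SHIFT assumes 4K pages, and the
min(prev_shift, PMD_SHIFT) clamp, the kmsan call and the cache flush are
omitted.

```c
#include <assert.h>
#include <stddef.h>

#define SK_PAGE_SHIFT 12u	/* assumption: 4K base pages */

/* One batch: pages [first, first + nr) mapped with a single shift. */
struct sk_range {
	unsigned int first, nr, shift;
};

/* Simplified order selection: floor(log2(contiguous run length)),
 * or 0 if the first pfn is not aligned to that order. */
static unsigned int sk_order(const unsigned long *pfns,
			     unsigned int max_steps)
{
	unsigned int n = 1, order = 0;

	assert(max_steps > 0);
	while (n < max_steps && pfns[n] == pfns[0] + n)
		n++;
	while ((2u << order) <= n)
		order++;
	if (pfns[0] & ((1UL << order) - 1))
		return 0;
	return order;
}

/* Mirrors the loop structure of vmap_batched(): emit a range whenever
 * the shift changes, stop scanning at the first base-page-sized entry,
 * and emit whatever remains as a final range. Returns the range count;
 * out must have room for count entries. */
static size_t sk_coalesce(const unsigned long *pfns, unsigned int count,
			  struct sk_range *out)
{
	unsigned int prev_shift = 0, idx = 0, i = 0;
	size_t n_out = 0;

	while (i < count) {
		unsigned int shift = SK_PAGE_SHIFT +
			sk_order(&pfns[i], count - i);

		if (i == 0)
			prev_shift = shift;

		if (shift != prev_shift) {
			out[n_out++] = (struct sk_range){ idx, i - idx,
							  prev_shift };
			prev_shift = shift;
			idx = i;
		}

		/* Small pages: assume the rest are small as well. */
		if (shift == SK_PAGE_SHIFT)
			break;

		i += 1u << (shift - SK_PAGE_SHIFT);
	}

	if (idx < count)
		out[n_out++] = (struct sk_range){ idx, count - idx,
						  prev_shift };
	return n_out;
}
```

For an array holding an aligned 16-page run, then an aligned 4-page run,
then two unrelated single pages, the model produces three ranges: 16
pages at shift 16, 4 pages at shift 14, and the last 2 pages at shift 12.
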
> > +
> >  /**
> >   * vmap - map an array of pages into virtually contiguous space
> >   * @pages: array of page pointers
> > @@ -3585,8 +3663,8 @@ void *vmap(struct page **pages, unsigned int count,
> >               return NULL;
> >
> >       addr = (unsigned long)area->addr;
> > -     if (vmap_pages_range(addr, addr + size, pgprot_nx(prot),
> > -                             pages, PAGE_SHIFT) < 0) {
> > +     if (vmap_batched(addr, addr + size, pgprot_nx(prot),
> > +                             pages) < 0) {
> >               vunmap(area->addr);
> >               return NULL;
> >       }
>


