Report: Performance regression from ib_umem_get on zone device pages
jane.chu at oracle.com
Wed Apr 23 22:35:06 PDT 2025
On 4/23/2025 4:28 PM, Jason Gunthorpe wrote:
>> The flow of a single test run:
>> 1. reserve virtual address space for (61440 * 2MB) via mmap with PROT_NONE
>> and MAP_ANONYMOUS | MAP_NORESERVE| MAP_PRIVATE
>> 2. mmap ((61440 * 2MB) / 12) from each of the 12 device-dax to the
>> reserved virtual address space sequentially to form a continual VA
>> space
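For reference, steps 1 and 2 above can be sketched in userspace roughly
as follows. This is only an illustration, not the actual test code:
map_contiguous() and its parameters are hypothetical names, and passing
NULL for the fd array substitutes anonymous memory for the device-dax
files so the sketch can run without the hardware. A real device-dax
mapping also needs the base address aligned to the device alignment
(e.g. 2MB), which is omitted here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Reserve nchunks * chunk bytes of address space with PROT_NONE, then map
 * each chunk at its fixed offset inside the reservation.
 *
 * fds is hypothetical: one open device-dax fd per chunk, or NULL to use
 * anonymous memory as a stand-in. Returns the base address, or MAP_FAILED.
 */
static void *map_contiguous(const int *fds, size_t nchunks, size_t chunk)
{
	size_t total = nchunks * chunk;
	char *base = mmap(NULL, total, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

	if (base == MAP_FAILED)
		return MAP_FAILED;

	for (size_t i = 0; i < nchunks; i++) {
		/* MAP_FIXED replaces the PROT_NONE placeholder in place. */
		int flags = MAP_FIXED |
			    (fds ? MAP_SHARED : MAP_PRIVATE | MAP_ANONYMOUS);
		void *p = mmap(base + i * chunk, chunk,
			       PROT_READ | PROT_WRITE, flags,
			       fds ? fds[i] : -1, 0);

		if (p == MAP_FAILED) {
			munmap(base, total);
			return MAP_FAILED;
		}
		assert(p == (void *)(base + i * chunk));
	}
	return base;
}
```

Each chunk becomes its own VMA, but the chunks are adjacent, so a single
MR registration can span the whole range.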
> Like is there any chance that each of these 61440 VMA's is a single
> 2MB folio from device-dax, or could it be?
>
> IIRC device-dax could not use folios until 6.15 so I'm assuming
> it is not folios even if it is a pmd mapping?
>
I just ran the MR registration stress test on 6.15-rc3, and it is much
better! What changed? Is it folio support for device-dax? None of the
code in ib_umem_get() has changed, though; it still loops through
'npages' doing
	pinned = pin_user_pages_fast(cur_base,
			min_t(unsigned long, npages,
			      PAGE_SIZE / sizeof(struct page *)),
			gup_flags, page_list);
	ret = sg_alloc_append_table_from_pages(&umem->sgt_append,
			page_list, pinned, 0,
			pinned << PAGE_SHIFT,
			ib_dma_max_seg_size(device), npages,
			GFP_KERNEL);
for up to 512 4K pages at a time (PAGE_SIZE / sizeof(struct page *)
with 4K pages and 8-byte pointers), and
zone_device_pages_have_same_pgmap() is expected to be called for each
4K page, showing no awareness of large folios.
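To put rough numbers on that loop, here is a small userspace sketch of
the arithmetic. The constants are my assumptions for x86-64 with 4K base
pages, estimate_pin_cost() is a made-up helper rather than kernel code,
and the "pmd_units" figure is only what a hypothetical folio-aware walk
would have to visit, not something the current code does.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4K base pages. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PTR_SIZE  8ULL             /* sizeof(struct page *) */
#define SKETCH_PMD_SIZE  (2ULL << 20)     /* one 2MB mapping */

struct pin_cost {
	uint64_t npages;    /* 4K pages covered by the registration */
	uint64_t batch;     /* pages requested per pin_user_pages_fast() */
	uint64_t gup_calls; /* iterations of the pinning loop */
	uint64_t pmd_units; /* 2MB units a folio-aware walk would visit */
};

/* Estimate the work done to register `bytes` of page-aligned memory. */
static struct pin_cost estimate_pin_cost(uint64_t bytes)
{
	struct pin_cost c;

	assert(bytes % SKETCH_PAGE_SIZE == 0);
	c.npages = bytes / SKETCH_PAGE_SIZE;
	/* page_list is assumed to be one page holding struct page pointers */
	c.batch = SKETCH_PAGE_SIZE / SKETCH_PTR_SIZE;
	c.gup_calls = (c.npages + c.batch - 1) / c.batch;
	c.pmd_units = (bytes + SKETCH_PMD_SIZE - 1) / SKETCH_PMD_SIZE;
	return c;
}
```

For the 61440 * 2MB range in this test, that works out to 31457280 4K
pages pinned in 61440 batches, with the per-page mergeability check
running once per 4K page rather than once per 2MB unit.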
thanks,
-jane