[PATCH net-next 04/12] mm: Make the page_frag_cache allocator use multipage folios
Mika Penttilä
mpenttil at redhat.com
Fri May 26 07:06:55 PDT 2023
Hi,
On 26.5.2023 15.47, David Howells wrote:
> Yunsheng Lin <linyunsheng at huawei.com> wrote:
>
>>> Change the page_frag_cache allocator to use multipage folios rather than
>>> groups of pages. This reduces page_frag_free to just a folio_put() or
>>> put_page().
>>
>> put_page() is not used in this patch, perhaps remove it to avoid
>> the confusion?
>
> Will do if I need to respin the patches.
>
>> Also, is there any significant difference between __free_pages()
>> and folio_put()? IOW, what does the 'reduces' part mean here?
>
> I meant that the folio code handles page compounding for us and we don't need
> to work out how big the page is for ourselves.
>
> If you look at __free_pages(), you can see a PageHead() call. folio_put()
> doesn't need that.
>
>> I followed some discussion about folios before, but have not really
>> understood the real difference between 'multipage folios' and
>> 'groups of pages' yet. Is a folio mostly used to avoid the confusion
>> about whether a page is the 'head page of a compound page', a 'base page' or
>> a 'tail page of a compound page'? Or is there any obvious benefit of
>> folios that I missed?
>
> There is a benefit: a folio pointer always points to the head page and so we
> never need to do "is this compound? where's the head?" logic to find it. When
> going from a page pointer, we still have to find the head.
>
But page_frag_free() uses folio_put(virt_to_folio(addr)), and
virt_to_folio() still depends on the compound page infrastructure to find
the head page and, from it, the folio.
> Ultimately, the aim is to reduce struct page to a typed pointer to massively
> reduce the amount of space consumed by mem_map[]. A page struct will then
> point at a folio or a slab struct or one of a number of different types. But
> to get to that point, we have to stop a whole lot of things from using page
> structs, but rather use some other type, such as folio.
>
> Eventually, there won't be a need for head pages and tail pages per se - just
> memory objects of different sizes.
>
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>> index 306a3d1a0fa6..d7c52a5979cc 100644
>>> --- a/include/linux/mm_types.h
>>> +++ b/include/linux/mm_types.h
>>> @@ -420,18 +420,13 @@ static inline void *folio_get_private(struct folio *folio)
>>> }
>>>
>>> struct page_frag_cache {
>>> - void * va;
>>> -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>>> - __u16 offset;
>>> - __u16 size;
>>> -#else
>>> - __u32 offset;
>>> -#endif
>>> + struct folio *folio;
>>> + unsigned int offset;
>>> /* we maintain a pagecount bias, so that we dont dirty cache line
>>> * containing page->_refcount every time we allocate a fragment.
>>> */
>>> - unsigned int pagecnt_bias;
>>> - bool pfmemalloc;
>>> + unsigned int pagecnt_bias;
>>> + bool pfmemalloc;
>>> };
>>
>> It seems the 'va' and 'size' fields were used, before this patch, to avoid
>> touching 'struct page' and thus avoid possible cache bouncing when more frags
>> can be allocated from the page while other frags are freed at the same time?
>
> Hmmm... fair point, though va is calculated from the page pointer on most
> arches without the need to dereference struct page (only arc, m68k and sparc
> define WANT_PAGE_VIRTUAL).
>
> David
>
--Mika