[PATCH v6 0/9] variable-order, large folios for anonymous memory

Ryan Roberts ryan.roberts at arm.com
Tue Oct 31 06:12:18 PDT 2023


On 31/10/2023 11:58, David Hildenbrand wrote:
> On 31.10.23 12:50, Ryan Roberts wrote:
>> On 06/10/2023 21:06, David Hildenbrand wrote:
>> [...]
>>>
>>> Change 2: sysfs interface.
>>>
>>> If we call it THP, it shall go under "/sys/kernel/mm/transparent_hugepage/", I
>>> agree.
>>>
>>> What we expose there and how, is TBD. Again, not a friend of "orders" and
>>> bitmaps at all. We can do better if we want to go down that path.
>>>
>>> Maybe we should take a look at hugetlb, and how they added support for multiple
>>> sizes. What *might* make sense could be (depending on which values we actually
>>> support!)
>>>
>>>
>>> /sys/kernel/mm/transparent_hugepage/hugepages-64kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-128kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-256kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-512kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/
>>> /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/
>>>
>>> Each one would contain an "enabled" and "defrag" file. We want something minimal
>>> first? Start with the "enabled" option.
>>>
>>>
>>> enabled: always [global] madvise never
>>>
>>> Initially, we would set it for PMD-sized THP to "global" and for everything else
>>> to "never".
>>
>> Hi David,
> 
> Hi!
> 
>>
>> I've just started coding this, and it occurs to me that I might need a small
>> clarification here; the existing global "enabled" control is used to drive
>> decisions for both anonymous memory and (non-shmem) file-backed memory. But the
>> proposed new per-size "enabled" is implicitly only controlling anon memory (for
>> now).
> 
> Anon was (way) first, and pagecache later decided to reuse that one as an
> indication whether larger folios are desired.
> 
> For the pagecache, it's just a way to enable/disable it globally. As there is no
> memory waste, nobody currently really cares about the exact sizes the pagecache
> is allocating (maybe that will change at some point, maybe not, who knows).

Yup. It's not _just_ about allocation though; it's also about collapse
(MADV_COLLAPSE, khugepaged) which is supported for pagecache pages. I can
imagine value in collapsing to various sizes that are beneficial for HW...
anyway that's for another day.

> 
>>
>> 1) Is this potentially confusing for the user? Should we rename the per-size
>> controls to "anon_enabled"? Or is it preferable to just keep it vague for now so
>> we can reuse the same control for file-backed memory in future?
> 
> The latter would be my take. Just like we did with the global toggle.

ACK

> 
>>
>> 2) The global control will continue to drive the file-backed memory decision
>> (for now), even when hugepages-2048kB/enabled != "global"; agreed?
> 
> That would be my take; it will allocate other sizes already, so just glue it to
> the global toggle and document for the other toggles that they only control
> anonymous THP for now.

ACK
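
For reference, the semantics agreed above could be sketched roughly as below.
This is only an illustration, not code from the series: the enum and function
names are hypothetical, and applying the madvise check to the file-backed path
is an assumption. The point is that a per-size "global" defers to the
top-level toggle, while file-backed memory ignores the per-size controls.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the kernel source. */
enum thp_policy {
	THP_NEVER,
	THP_MADVISE,
	THP_ALWAYS,
	THP_GLOBAL,	/* per-size only: defer to the global "enabled" toggle */
};

/* Resolve a per-size "enabled" value against the global one. */
static enum thp_policy effective_policy(enum thp_policy global,
					enum thp_policy per_size)
{
	return per_size == THP_GLOBAL ? global : per_size;
}

/* Does a resolved policy permit a large folio for this mapping? */
static bool policy_allows(enum thp_policy policy, bool madvised)
{
	switch (policy) {
	case THP_ALWAYS:
		return true;
	case THP_MADVISE:
		return madvised;
	default:
		/* THP_NEVER, or an unresolved THP_GLOBAL, allows nothing. */
		return false;
	}
}

/* Anonymous memory: the per-size control decides, inheriting if "global". */
static bool anon_size_allowed(enum thp_policy global,
			      enum thp_policy per_size, bool madvised)
{
	return policy_allows(effective_policy(global, per_size), madvised);
}

/* File-backed memory: glued to the global toggle only (for now). */
static bool file_allowed(enum thp_policy global, bool madvised)
{
	return policy_allows(global, madvised);
}
```

With this shape, setting hugepages-2048kB/enabled to "never" would stop
PMD-sized anon allocations without affecting what the pagecache does.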

> 




More information about the linux-arm-kernel mailing list