How to use NAPOT-based huge pages (64 KiB) together with PMD-sized huge pages (2 MiB) via Transparent Huge Pages (THP) in GLIBC’s malloc()?
Nikita Koschin
nikita.koshin at cloudbear.ru
Tue Dec 2 08:02:56 PST 2025
Hello everyone,
We are trying to use the Svnapot extension in our Linux (version 6.15)
tests, but we have encountered several issues.
Many programs use malloc() for memory allocation, and GLIBC (version
2.42) serves large requests via mmap(). By default, however, GLIBC does
not request huge pages for these mappings.
We tried enabling huge-page-backed malloc() using the tunable:
GLIBC_TUNABLES=glibc.malloc.hugetlb=1 ./test.elf
This enables Transparent Huge Pages (THP), but, as seen in GLIBC
sources[1], the internal __malloc_default_thp_pagesize() function reads
only "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", which always
reports the PMD-sized THP (2 MiB on most platforms). It does not
consider smaller THP sizes (e.g., 64 KiB), even if they are enabled and
usable.
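
To illustrate the difference, here is a minimal sketch (not GLIBC code) that lists every per-size THP directory instead of reading only hpage_pmd_size. It assumes the per-size sysfs layout used in the examples in this message (hugepages-<N>kB/enabled, with the active mode shown in brackets); the function name enabled_thp_sizes is hypothetical.

```python
import os
import re

def enabled_thp_sizes(thp_root="/sys/kernel/mm/transparent_hugepage"):
    """Return {size_in_KiB: active_mode} for each hugepages-<N>kB directory."""
    sizes = {}
    if not os.path.isdir(thp_root):
        return sizes
    for entry in os.listdir(thp_root):
        match = re.fullmatch(r"hugepages-(\d+)kB", entry)
        if not match:
            continue
        try:
            with open(os.path.join(thp_root, entry, "enabled")) as f:
                text = f.read()
        except OSError:
            continue
        # The active mode is the bracketed word,
        # e.g. "always [inherit] madvise never" -> "inherit".
        active = re.search(r"\[(\w+)\]", text)
        if active:
            sizes[int(match.group(1))] = active.group(1)
    return sizes

if __name__ == "__main__":
    # Prints an empty dict on systems without per-size THP controls.
    print(enabled_thp_sizes())
```

An allocator that wanted to consider 64 KiB THP would need this kind of enumeration (or an equivalent) rather than a single PMD-size value.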
When using explicit huge pages:
GLIBC_TUNABLES=glibc.malloc.hugetlb=2 ...
GLIBC checks “Hugepagesize:” from "/proc/meminfo", and allocation works
— but only with the system’s default huge page size. For example:
- If Hugepagesize: 2048 kB, then allocations in the range [2112 KiB,
  3008 KiB] use:
    - one 2 MiB huge page + remainder in 4 KiB pages (no 64 KiB).
- If Hugepagesize: 64 kB, then only 64 KiB pages are used, even for
  multi-megabyte requests.
This behaviour is rigid and prevents hybrid allocations (e.g., 1 × 2 MiB
+ several × 64 KiB pages in a single malloc block).
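
As a worked example of the hybrid layout we have in mind, the sketch below greedily splits a size into 2 MiB, 64 KiB and 4 KiB pages, largest first. It is pure arithmetic: it ignores the virtual and physical alignment that real huge-page mappings need, and the function name decompose is hypothetical.

```python
def decompose(size_kib, page_sizes_kib=(2048, 64, 4)):
    """Greedily split size_kib into page counts, largest page size first."""
    remaining = size_kib
    plan = {}
    for page in sorted(page_sizes_kib, reverse=True):
        count, remaining = divmod(remaining, page)
        plan[page] = count
    if remaining:
        raise ValueError("size is not a multiple of the smallest page size")
    return plan

if __name__ == "__main__":
    for size in (2112, 2816, 3008):
        print(size, "KiB ->", decompose(size))
```

For the 2816 KiB case used in the examples, this gives one 2 MiB page plus twelve 64 KiB pages and no 4 KiB pages.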
Question:
Is there any supported way to let GLIBC’s malloc() use both 2 MiB and 64
KiB THP sizes within a single allocation, without relying on
glibc.malloc.hugetlb=1 or 2 (which ties us to the global default huge
page size and setting nr_hugepages)?
[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/malloc-hugepages.c;h=e23cdfb6b72ff31c5c449e378181b674827b7c21;hb=d2097651cc57834dbfcaa102ddfacae0d86cfb66
[2] https://docs.kernel.org/admin-guide/mm/transhuge.html#khugepaged-control
Examples:
NOTE: test.elf below uses malloc() with sizes that are multiples of 64
KiB, but GLIBC’s internal allocator metadata causes the actual committed
size to be larger.
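
The effect of that rounding can be sketched as follows. This is only arithmetic under an assumption: the 16-byte overhead is a hypothetical placeholder for the allocator's per-chunk metadata, not a value taken from the GLIBC sources (any small non-zero overhead gives the same results here). The mapping is rounded up to a whole number of pages of the size in use.

```python
def mapped_size(request_bytes, page_bytes, overhead_bytes=16):
    """Round request plus assumed allocator overhead up to whole pages."""
    total = request_bytes + overhead_bytes
    pages = -(-total // page_bytes)  # ceiling division
    return pages * page_bytes

if __name__ == "__main__":
    request = 2816 * 1024  # 2883584 B, as printed by test.elf
    for page_kib in (4, 64, 2048):
        size_kib = mapped_size(request, page_kib * 1024) // 1024
        print(f"{page_kib} KiB pages -> mapping of {size_kib} KiB")
```

With 4 KiB, 64 KiB and 2 MiB pages this yields 2820 KiB, 2880 KiB and 4096 KiB, which matches the Size: values in the smaps output in this message.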
#################### trying with 2MB and 64k THP
##### Per the Linux docs [2], khugepaged (also for madvise() regions)
searches only for PMD-sized (2 MiB) pages. This is just an example.
~$ echo always > /sys/kernel/mm/transparent_hugepage/defrag
~$ echo always > /sys/kernel/mm/transparent_hugepage/enabled
~$ echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled
~$ cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
always [inherit] madvise never
####### trying without the tunable
~$ /test.elf
Check for 2816kB...
smaps:
3fb9da9000-3fba06a000 rw-s 0 00:00 0
Size: 2820 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 2820 kB
Pss: 2820 kB
PssDirty: 2820 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 2820 kB
Referenced: 2820 kB
Anonymous: 2820 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 2048 kB ---- allocate PMD as expected
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
SharedHugetlb: 0 kB
PrivateHugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 1
VmFlags: rd wr mr mw me ac
Allocated 2883584 B.
####### trying with the THP tunable
~$ GLIBC_TUNABLES=glibc.malloc.hugetlb=1 /test.elf
Check for 2816kB...
3fb6be4000-3fb6ea5000 rw-s 0 00:00 0
Size: 2820 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 2820 kB
Pss: 2820 kB
PssDirty: 2820 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 2820 kB
Referenced: 2820 kB
Anonymous: 2820 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 2048 kB ----- same situation
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
SharedHugetlb: 0 kB
PrivateHugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 1
VmFlags: rd wr mr mw me ac
Allocated 2883584 B.
#################### setting tunable to 1 with 64k THP only
~$ echo never > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
~$ GLIBC_TUNABLES=glibc.malloc.hugetlb=1 /test.elf
/proc/self/smaps:
.......
3f89ec1000-3f8a182000 rw-s 0 00:00 0
Size: 2820 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 2820 kB
Pss: 2820 kB
PssDirty: 2820 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 2820 kB
Referenced: 2820 kB
Anonymous: 2820 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB ----------- nothing !
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
SharedHugetlb: 0 kB
PrivateHugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 1
..................
#################### setting tunable to 2 with 2MB and 64k common hugepages
~ # echo 256 > /sys/kernel/mm/hugepages/hugepages-64kB/nr_hugepages
~ # echo 256 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
~ # cat /proc/meminfo
MemTotal: 4018064 kB
MemFree: 3002224 kB
MemAvailable: 2980220 kB
...............
Active: 6848 kB
Inactive: 425832 kB
Active(anon): 6848 kB
Inactive(anon): 425832 kB
...............
AnonPages: 4348 kB
Mapped: 5796 kB
Shmem: 428416 kB
KReclaimable: 3724 kB
Slab: 21296 kB
SReclaimable: 3724 kB
SUnreclaim: 17572 kB
KernelStack: 1196 kB
PageTables: 520 kB
CommitLimit: 1738696 kB
Committed_AS: 434492 kB
VmallocTotal: 67108864 kB
VmallocUsed: 1516 kB
VmallocChunk: 0 kB
Percpu: 144 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
Balloon: 0 kB
HugePages_Total: 256
HugePages_Free: 256
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB ------------------ can be changed with the
default_hugepagesz boot argument, but that is not a good solution
Hugetlb: 540672 kB
~$ GLIBC_TUNABLES=glibc.malloc.hugetlb=2 /test.elf
Check for 2816kB...
3f98600000-3f98a00000 rw-s 0 00:0f 1843 /anon_hugepage (deleted)
Size: 4096 kB ------- larger than the requested size
KernelPageSize: 2048 kB
MMUPageSize: 2048 kB
Rss: 0 kB
Pss: 0 kB
PssDirty: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
Anonymous: 0 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
SharedHugetlb: 0 kB
PrivateHugetlb: 4096 kB -------- 2MB hugepages again
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
VmFlags: rd wr mr mw me de ht
Allocated 2883584 B.
#################### setting tunable to 64K manually:
~$ GLIBC_TUNABLES=glibc.malloc.hugetlb=0x10000 /test.elf
Check for 2816kB...
3f95da0000-3f96070000 rw-s 0 00:10 1847 /anon_hugepage (deleted)
Size: 2880 kB
KernelPageSize: 64 kB
MMUPageSize: 64 kB
Rss: 0 kB
Pss: 0 kB
PssDirty: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
Anonymous: 0 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
SharedHugetlb: 0 kB
PrivateHugetlb: 2880 kB ----- only 64K hugepages
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
VmFlags: rd wr mr mw me de ht
Allocated 2883584 B.