[RESEND RFC PATCH 0/2] Enable vmalloc huge mappings by default on arm64
Uladzislau Rezki
urezki at gmail.com
Mon Jan 12 02:49:50 PST 2026
On Fri, Dec 12, 2025 at 09:56:59AM +0530, Dev Jain wrote:
> In the quest for reducing TLB pressure via block mappings, enable huge
> vmalloc by default on arm64 for BBML2-noabort systems which support kernel
> live mapping split.
>
> This series is an RFC, because I cannot get a performance improvement for
> the usual benchmarks which we have. Currently, vmalloc follows an opt-in
> approach for block mappings - the users calling vmalloc_huge() are the ones
> which expect the most advantage from block mappings. Most users of
> vmalloc(), kvmalloc() and kvzalloc() map a single page. After applying this
> series, it is expected that a considerable number of users will produce
> cont mappings, and probably none will produce PMD mappings.
>
> I am asking for help from the community in testing - I believe that one of
> the testing methods is xfstests: a lot of code uses the APIs mentioned
> above. I am hoping that someone can jump in and run at least xfstests, and
> probably some other tests which can take advantage of the reduced TLB
> pressure from vmalloc cont mappings.
>
I checked how often vmalloc/vmap is triggered when I run xfstests. Note that this
also depends on the environment and can differ from one setup to another. The
event was enabled with:
"echo vmalloc:alloc_vmap_area > set_event"
urezki at milan:~/data/optane/xfs-test/xfstests.git$ wc -l ./vmalloc_traces/*.trace
2875 ./vmalloc_traces/generic_036.trace
30117 ./vmalloc_traces/generic_038.trace
8481 ./vmalloc_traces/generic_051.trace
16986 ./vmalloc_traces/generic_055.trace
6079 ./vmalloc_traces/generic_068.trace
2792 ./vmalloc_traces/generic_070.trace
26945 ./vmalloc_traces/generic_072.trace
2772 ./vmalloc_traces/generic_076.trace
2750 ./vmalloc_traces/generic_083.trace
3319 ./vmalloc_traces/generic_095.trace
2855 ./vmalloc_traces/generic_232.trace
3537 ./vmalloc_traces/generic_269.trace
21265 ./vmalloc_traces/generic_299.trace
3231 ./vmalloc_traces/generic_300.trace
3050 ./vmalloc_traces/generic_323.trace
2831 ./vmalloc_traces/generic_390.trace
4296 ./vmalloc_traces/generic_461.trace
4807 ./vmalloc_traces/generic_476.trace
3198 ./vmalloc_traces/generic_551.trace
3096 ./vmalloc_traces/generic_616.trace
6495 ./vmalloc_traces/generic_627.trace
11232 ./vmalloc_traces/generic_642.trace
11706 ./vmalloc_traces/generic_650.trace
3135 ./vmalloc_traces/generic_750.trace
5926 ./vmalloc_traces/generic_751.trace
77623 ./vmalloc_traces/xfs_013.trace
9172 ./vmalloc_traces/xfs_017.trace
4145 ./vmalloc_traces/xfs_068.trace
2982 ./vmalloc_traces/xfs_104.trace
7293 ./vmalloc_traces/xfs_167.trace
18851 ./vmalloc_traces/xfs_168.trace
4373 ./vmalloc_traces/xfs_442.trace
3550 ./vmalloc_traces/xfs_609.trace
321765 total
urezki at milan:~/data/optane/xfs-test/xfstests.git$
The execution time differs from test to test. For example, the "xfs_013" test
takes around 200 seconds on my system and is at the top in number of calls:
77623 / 200 = 388.115 calls/sec
200 / 77623 = 0.002576 s = ~one call every 2.6 ms
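If per-test runtimes are at hand, the same arithmetic can be applied to every
test. A sketch, assuming xfstests keeps the per-test durations in
results/check.time as "<test> <seconds>" lines (adjust path/format for your setup):

    for f in ./vmalloc_traces/*.trace; do
        t=$(basename "$f" .trace | tr _ /)               # generic_036 -> generic/036
        secs=$(awk -v t="$t" '$1 == t { print $2 }' results/check.time)
        [ -n "$secs" ] && [ "$secs" -gt 0 ] || continue
        wc -l < "$f" | awk -v s="$secs" -v t="$t" \
            '{ printf "%-16s %8d events  %8.1f calls/sec\n", t, $1, $1 / s }'
    done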
Please note that I have not checked the impact of your patches on execution
time or on TLB pressure.
--
Uladzislau Rezki