[LSF/MM/BPF TOPIC] Memory fragmentation with large block sizes
Hannes Reinecke
hare at suse.de
Thu Feb 19 23:44:02 PST 2026
On 2/19/26 15:32, Theodore Tso wrote:
> On Thu, Feb 19, 2026 at 10:54:48AM +0100, Hannes Reinecke wrote:
>> Hi all,
>>
>> I (together with the Czech Technical University) did some experiments trying
>> to measure memory fragmentation with large block sizes.
>> The testbed was an NVMe setup talking to nvmet storage over
>> the network.
>>
>> Doing so raised some challenges:
>>
>> - How do you _generate_ memory fragmentation? The MM subsystem is
>> precisely geared up to avoid it, so you would need to come up
>> with some idea of how to defeat it. With help from Willy I managed
>> to come up with something, but I really would like to discuss
>> what would be the best option here.
>
> I'm trying to understand the goal of the experiment. I'm guessing
> that the goal was to see how much memory fragmentation would result
> from using large block sizes with the control being to use, say, 4k
> blocks. Is that correct?
>
The main goal was to figure out whether memory fragmentation
increases when using large block sizes (LBS).
Clearly, most (internal) allocations still work on page-sized
objects, so one can argue that using LBS might increase fragmentation.
On the other hand, all _filesystem_ objects will be in LBS sizes,
so we won't increase fragmentation if we only allocate in LBS sizes.
So which is it?
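
To compare the two cases one needs a number for "fragmentation". A
minimal sketch of one option, assuming the per-order free block counts
from /proc/buddyinfo are a good enough input (this is an illustration,
not necessarily the measurement used in the experiments): compute the
fraction of free memory that sits in blocks too small to satisfy an
allocation of a given order. The helper names parse_buddyinfo() and
unusable_index() are made up for this example.

```python
def parse_buddyinfo(text):
    """Parse /proc/buddyinfo-style text into {(node, zone): [count per order]}."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: "Node 0, zone Normal <order-0 count> <order-1 count> ..."
        if len(fields) < 5 or fields[0] != "Node":
            continue
        node = int(fields[1].rstrip(","))
        zone = fields[3]
        zones[(node, zone)] = [int(n) for n in fields[4:]]
    return zones


def unusable_index(counts, order):
    """Fraction of free pages held in blocks smaller than 2**order pages.

    0.0 means every free page could serve an allocation of this order,
    1.0 means none could. Having no free memory at all is treated as 1.0
    (an assumption of this sketch).
    """
    total = sum(n << o for o, n in enumerate(counts))
    if total == 0:
        return 1.0
    usable = sum(n << o for o, n in enumerate(counts) if o >= order)
    return (total - usable) / total
```

Feeding it the contents of /proc/buddyinfo before and after a test run,
and comparing unusable_index() at the order matching the block size
(order 2 for 16k blocks on 4k pages), would show whether the value grows
over the run.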
> So I guess the question here is what are realistic workloads that
> people would have in real world situations, so we can do the A-B
> experiments to see what using LBS results in?
>
Yes.
>> - What is acceptable memory fragmentation? Are we good enough if the
>> measured fragmentation does not grow during the test runs?
>
> I can think of two possible metrics. The first is whether it results
> in degradation of performance given certain real world workloads.
>
> The second is whether given a particular memory pressure, the memory
> fragmentation results in more jobs getting OOM killed.
>
That would be ideal, but we first need to have a program exerting
memory pressure...
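
As a starting point for discussion, here is a rough userspace sketch of
such a program. It is an assumption on my side about what a stressor
could look like, not the approach worked out with Willy: map anonymous
memory, touch every page so it gets backed by a physical page, then hand
every other page back with MADV_DONTNEED so that the freed pages are
scattered between pages still in use. The function name punch_holes()
and the stride parameter are made up for this example, and whether the
holes end up fragmented at the physical level depends on the allocator
state and on THP settings.

```python
import mmap


def punch_holes(num_pages, stride=2):
    """Touch num_pages anonymous pages, then release every stride-th one.

    Returns (region, punched). Keep a reference to region for as long as
    the remaining pages should stay allocated.
    """
    page = mmap.PAGESIZE
    # Private anonymous mapping, so MADV_DONTNEED really drops the pages.
    region = mmap.mmap(-1, num_pages * page,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    # Write one byte per page to force the kernel to allocate it.
    for i in range(num_pages):
        region[i * page] = 1
    # Give back every stride-th page, leaving single-page holes.
    punched = 0
    for i in range(0, num_pages, stride):
        region.madvise(mmap.MADV_DONTNEED, i * page, page)
        punched += 1
    return region, punched
```

Run with num_pages sized to a good part of RAM (and the region kept
alive) this should leave mostly small free blocks behind, but it needs
Linux and Python 3.8 or later for mmap.madvise().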
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich