[PATCH v7] arm64: mm: Populate vmemmap at the page level if not section aligned
Zhenhua Huang
quic_zhenhuah at quicinc.com
Mon Mar 3 01:29:55 PST 2025
On 2025/2/27 1:13, David Hildenbrand wrote:
> Sorry, I somehow missed this mail.
>
No problem. Thanks for your reply.
>>>> Hi David,
>>>>
>>>> I had the same doubt initially.
>>>> After going through the code, I noticed that for vmemmap_populate() the
>>>> "start" and "end" arguments passed down are already within a single
>>>> section.
>>>> early path:
>>>> for_each_present_section_nr
>>>> __populate_section_memmap
>>>> ..
>>>> vmemmap_populate()
>>>>
>>>> hotplug path:
>>>> __add_pages
>>>> section_activate
>>>> vmemmap_populate()
>>>>
>>>> Therefore, focusing only on the size seems OK to me, and the fallback
>>>> solution below appears unnecessary?
>>>
>>> Ah, in that case it is fine. Might make sense to document/enforce that
>>> somehow for the time being ...
>>
>> Shall I document that and WARN_ON if the size is exceeded? Something like:
>> WARN_ON(end - start > PAGES_PER_SECTION * sizeof(struct page))
>
> Probably WARN_ON_ONCE() along with a comment that we should never exceed
> a single memory section.
Got it.
>
>>
>> Since vmemmap_populate() is implemented per architecture, the change
>> should apply to other architectures as well. However, I have no setup to
>> test it on...
>> Therefore, may I implement it only for arm64 for now?
>
> Would work for me; better than no warning.
I made a patch several days ago; could you please help review it?
https://lore.kernel.org/linux-mm/20250219084001.1272445-1-quic_zhenhuah@quicinc.com/T/
Is there anything else you would suggest, besides preferring WARN_ON_ONCE()?
>
>> Additionally, from the previous discussion, the change is worth
>> backporting (apologies for missing to CC the stable kernel list in this
>> version). Keeping it arm64-only should simplify backporting. WDYT?
>
> Jup. Of course, we could add a generic warning in a separate patch.
>
>>
>>>
>>>
>>>>> +/*
>>>>> + * Try to populate PMDs, but fallback to populating base pages when ranges
>>>>> + * would only partially cover a PMD.
>>>>> + */
>>>>>  int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
>>>>>                                           int node, struct vmem_altmap *altmap)
>>>>>  {
>>>>> @@ -313,6 +317,9 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
>>>>>          for (addr = start; addr < end; addr = next) {
>>>>
>>>> This for loop appears to be redundant for arm64 as well, as above
>>>> mentioned, a single call to pmd_addr_end() should suffice.
>>>
>>> Right, that was what was confusing me in the first place.
>>>
>>>>
>>>>>                  next = pmd_addr_end(addr, end);
>>>>>
>>>>> +                if (!IS_ALIGNED(addr, PMD_SIZE) || !IS_ALIGNED(next, PMD_SIZE))
>>>>> +                        goto fallback;
>>>>> +
>>>>>                  pgd = vmemmap_pgd_populate(addr, node);
>>>>>                  if (!pgd)
>>>>>                          return -ENOMEM;
>>>>> @@ -346,6 +353,7 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
>>>>>                          }
>>>>>                  } else if (vmemmap_check_pmd(pmd, node, addr, next))
>>>>>                          continue;
>>>>> +fallback:
>>>>>                  if (vmemmap_populate_basepages(addr, next, node, altmap))
>>>>>                          return -ENOMEM;
>>>>
>>>> It seems we never get the chance to call vmemmap_populate_basepages() here?
>>>
>>>
>>> Can you elaborate?
>>
>> It's invoked within vmemmap_populate_hugepages(), which is called by
>> vmemmap_populate(). This implies that we are always performing a
>> whole-section hotplug?
>
> Ah, you meant only in the context of this change, yes. I was confused,
> because there are other reasons why we run into that fallback (failing
> to allocate a PMD).
I observed that this logic was introduced in commit 2045a3b8911b
("mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()"), which
moved it from arch-specific code (such as vmemmap_set_pmd() in
arch/loongarch/mm/init.c and arch/x86/mm/init_64.c) into the common,
arch-independent vmemmap_populate_hugepages().
I suspect this is the source of the confusion, as the generalised code
may not be fully tailored to arm64. However, it does not seem to cause
any bugs :)
>