Re: [PATCH v4 7/7] mm/mm_init: Use for_each_valid_pfn() in init_unavailable_range()
David Woodhouse
dwmw2 at infradead.org
Fri Apr 25 13:36:21 PDT 2025
On 25 April 2025 21:12:49 BST, David Hildenbrand <david at redhat.com> wrote:
>On 25.04.25 21:08, David Woodhouse wrote:
>> On 25 April 2025 17:17:25 BST, David Hildenbrand <david at redhat.com> wrote:
>>> On 23.04.25 15:33, David Woodhouse wrote:
>>>> From: David Woodhouse <dwmw at amazon.co.uk>
>>>>
>>>> Currently, memmap_init initializes pfn_hole with 0 instead of
>>>> ARCH_PFN_OFFSET. Then init_unavailable_range will start iterating each
>>>> page from the page at address zero to the first available page, but it
>>>> won't do anything for pages below ARCH_PFN_OFFSET because pfn_valid()
>>>> fails for them.
>>>>
>>>> If ARCH_PFN_OFFSET is very large (e.g., something like 2^64-2GiB if the
>>>> kernel is used as a library and loaded at a very high address), the
>>>> pointless iteration for pages below ARCH_PFN_OFFSET will take a very
>>>> long time, and the kernel will look stuck at boot time.
>>>>
>>>> Use for_each_valid_pfn() to skip the pointless iterations.
>>>>
>>>> Reported-by: Ruihan Li <lrh2000 at pku.edu.cn>
>>>> Suggested-by: Mike Rapoport <rppt at kernel.org>
>>>> Signed-off-by: David Woodhouse <dwmw at amazon.co.uk>
>>>> Reviewed-by: Mike Rapoport (Microsoft) <rppt at kernel.org>
>>>> Tested-by: Ruihan Li <lrh2000 at pku.edu.cn>
>>>> ---
>>>> mm/mm_init.c | 6 +-----
>>>> 1 file changed, 1 insertion(+), 5 deletions(-)
>>>>
>>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>>> index 41884f2155c4..0d1a4546825c 100644
>>>> --- a/mm/mm_init.c
>>>> +++ b/mm/mm_init.c
>>>> @@ -845,11 +845,7 @@ static void __init init_unavailable_range(unsigned long spfn,
>>>> unsigned long pfn;
>>>> u64 pgcnt = 0;
>>>> - for (pfn = spfn; pfn < epfn; pfn++) {
>>>> - if (!pfn_valid(pageblock_start_pfn(pfn))) {
>>>> - pfn = pageblock_end_pfn(pfn) - 1;
>>>> - continue;
>>>> - }
>>>
>>> So, if the first pfn in a pageblock is not valid, we skip the whole pageblock ...
>>>
>>>> + for_each_valid_pfn(pfn, spfn, epfn) {
>>>> __init_single_page(pfn_to_page(pfn), pfn, zone, node);
>>>> __SetPageReserved(pfn_to_page(pfn));
>>>> pgcnt++;
>>>
>>> but here, we would process further pfns inside such a pageblock?
>>>
>>
>> Is it not the case that either *all*, or *none*, of the PFNs within a given pageblock will be valid?
>
>Hmm, good point. I was thinking about sub-sections, but all early sections are fully valid.
>
>(Also, at least on x86, the subsection size should match the pageblock size; might not be the case on other architectures, like arm64 with 64K base pages ...)
>
>>
>> I assumed that was *why* it had that skip, as an attempt at the kind of optimisation that for_each_valid_pfn() now gives us?
>
>But it's interesting in this code that we didn't optimize for "if the first pfn is valid, all the remaining ones are valid". We would still check each PFN.
>
>In any case, trying to figure out why Lorenzo ran into an issue ... if it's not because of the pageblock, maybe something in for_each_valid_pfn with sparsemem is still shaky.
>
A previous round of the patch series had a less aggressively optimised version of the sparsemem implementation...?
Will see if I can reproduce in the morning. A boot in QEMU worked here before I sent it out.