Re: [PATCH v4 7/7] mm/mm_init: Use for_each_valid_pfn() in init_unavailable_range()

David Woodhouse dwmw2 at infradead.org
Fri Apr 25 13:36:21 PDT 2025


On 25 April 2025 21:12:49 BST, David Hildenbrand <david at redhat.com> wrote:
>On 25.04.25 21:08, David Woodhouse wrote:
>> On 25 April 2025 17:17:25 BST, David Hildenbrand <david at redhat.com> wrote:
>>> On 23.04.25 15:33, David Woodhouse wrote:
>>>> From: David Woodhouse <dwmw at amazon.co.uk>
>>>> 
>>>> Currently, memmap_init initializes pfn_hole with 0 instead of
>>>> ARCH_PFN_OFFSET. Then init_unavailable_range will start iterating each
>>>> page from the page at address zero to the first available page, but it
>>>> won't do anything for pages below ARCH_PFN_OFFSET because pfn_valid
>>>> won't pass.
>>>> 
>>>> If ARCH_PFN_OFFSET is very large (e.g., something like 2^64-2GiB if the
>>>> kernel is used as a library and loaded at a very high address), the
>>>> pointless iteration for pages below ARCH_PFN_OFFSET will take a very
>>>> long time, and the kernel will look stuck at boot time.
>>>> 
>>>> Use for_each_valid_pfn() to skip the pointless iterations.
>>>> 
>>>> Reported-by: Ruihan Li <lrh2000 at pku.edu.cn>
>>>> Suggested-by: Mike Rapoport <rppt at kernel.org>
>>>> Signed-off-by: David Woodhouse <dwmw at amazon.co.uk>
>>>> Reviewed-by: Mike Rapoport (Microsoft) <rppt at kernel.org>
>>>> Tested-by: Ruihan Li <lrh2000 at pku.edu.cn>
>>>> ---
>>>>    mm/mm_init.c | 6 +-----
>>>>    1 file changed, 1 insertion(+), 5 deletions(-)
>>>> 
>>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>>> index 41884f2155c4..0d1a4546825c 100644
>>>> --- a/mm/mm_init.c
>>>> +++ b/mm/mm_init.c
>>>> @@ -845,11 +845,7 @@ static void __init init_unavailable_range(unsigned long spfn,
>>>>    	unsigned long pfn;
>>>>    	u64 pgcnt = 0;
>>>> 
>>>> -	for (pfn = spfn; pfn < epfn; pfn++) {
>>>> -		if (!pfn_valid(pageblock_start_pfn(pfn))) {
>>>> -			pfn = pageblock_end_pfn(pfn) - 1;
>>>> -			continue;
>>>> -		}
>>> 
>>> So, if the first pfn in a pageblock is not valid, we skip the whole pageblock ...
>>> 
>>>> +	for_each_valid_pfn(pfn, spfn, epfn) {
>>>>    		__init_single_page(pfn_to_page(pfn), pfn, zone, node);
>>>>    		__SetPageReserved(pfn_to_page(pfn));
>>>>    		pgcnt++;
>>> 
>>> but here, we would process further pfns inside such a pageblock?
>>> 
>> 
>> Is it not the case that either *all*, or *none*, of the PFNs within a given pageblock will be valid?
>
>Hmm, good point. I was thinking about sub-sections, but all early sections are fully valid.
>
>(Also, at least on x86, the subsection size should match the pageblock size; might not be the case on other architectures, like arm64 with 64K base pages ...)
>
>> 
>> I assumed that was *why* it had that skip, as an attempt at the kind of optimisation that for_each_valid_pfn() now gives us?
>
>But it's interesting in this code that we didn't optimize for "if the first pfn is valid, all the remaining ones are valid". We would still check each PFN.
>
>In any case, trying to figure out why Lorenzo ran into an issue ... if it's not because of the pageblock, maybe something in for_each_valid_pfn with sparsemem is still shaky.
>

A previous round of the patch series had a less aggressively optimised version of the sparsemem implementation...?

Will see if I can reproduce in the morning. A boot in QEMU worked here before I sent it out.



More information about the linux-arm-kernel mailing list