[RFC] arm64: mm: update max_pfn after memory hotplug

David Hildenbrand david at redhat.com
Fri Sep 24 01:17:46 PDT 2021


On 24.09.21 04:47, Florian Fainelli wrote:
> 
> 
> On 9/23/2021 3:54 PM, Chris Goldsworthy wrote:
>> From: Sudarshan Rajagopalan <quic_sudaraja at quicinc.com>
>>
>> After new memory blocks have been hotplugged, max_pfn and max_low_pfn
>> needs updating to reflect on new PFNs being hot added to system.
>>
>> Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja at quicinc.com>
>> Signed-off-by: Chris Goldsworthy <quic_cgoldswo at quicinc.com>
>> ---
>>    arch/arm64/mm/mmu.c | 5 +++++
>>    1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index cfd9deb..fd85b51 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1499,6 +1499,11 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>    	if (ret)
>>    		__remove_pgd_mapping(swapper_pg_dir,
>>    				     __phys_to_virt(start), size);
>> +	else {
>> +		max_pfn = PFN_UP(start + size);
>> +		max_low_pfn = max_pfn;
>> +	}
> 
> This is a drive by review, but it got me thinking about your changes a bit:
> 
> - if you raise max_pfn when you hotplug memory, don't you need to lower
> it when you hot unplug memory as well?

The issue with lowering is that you actually have to do some search to 
figure out the actual value -- and it's not really worth the trouble. 
Raising the limit is easy.

With memory hotunplug, anybody wanting to take a look at a "struct page" 
via a pfn has to do a pfn_to_online_page() either way. That will fail if 
there isn't actually a memmap anymore because the memory has been 
unplugged. So "max_pfn" is really just a hint for the highest pfn worth 
looking at, and it can be bigger than the highest pfn that is actually 
present.

Take a look at the example usage in fs/proc/page.c:kpageflags_read():

pfn_to_online_page() will simply fail and stable_page_flags() will 
indicate a KPF_NOPAGE.

It's just as if we now had a big memory hole at the end of memory.
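
As a rough illustration of that pattern, here is a small user-space 
model. It is not kernel code: every toy_* name is a hypothetical stand-in, 
and the section size is made up. It only shows that a reader walking up to 
a stale max_pfn stays safe as long as every lookup goes through an 
"online" check that can fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: sizes and names are assumptions, not kernel definitions. */
#define TOY_PAGES_PER_SECTION 4UL
#define TOY_NR_SECTIONS       4UL

struct toy_page { unsigned long flags; };

struct toy_page toy_memmap[TOY_NR_SECTIONS * TOY_PAGES_PER_SECTION];
bool toy_section_online[TOY_NR_SECTIONS];
unsigned long toy_max_pfn;	/* raised on hotplug, never lowered */

/* Stand-in for pfn_to_online_page(): NULL if nothing online backs the pfn. */
struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	unsigned long section = pfn / TOY_PAGES_PER_SECTION;

	if (section >= TOY_NR_SECTIONS || !toy_section_online[section])
		return NULL;
	return &toy_memmap[pfn];
}

/* Hotplug: mark the section online and raise the limit if needed. */
void toy_online_section(unsigned long section)
{
	unsigned long end_pfn = (section + 1) * TOY_PAGES_PER_SECTION;

	assert(section < TOY_NR_SECTIONS);
	toy_section_online[section] = true;
	if (end_pfn > toy_max_pfn)
		toy_max_pfn = end_pfn;
}

/* Hotunplug: the section goes away, toy_max_pfn is left untouched. */
void toy_offline_section(unsigned long section)
{
	assert(section < TOY_NR_SECTIONS);
	toy_section_online[section] = false;
}

/* Walk [0, toy_max_pfn) and count pfns that would be reported as "no page". */
unsigned long toy_count_nopage(void)
{
	unsigned long pfn, nopage = 0;

	for (pfn = 0; pfn < toy_max_pfn; pfn++)
		if (!toy_pfn_to_online_page(pfn))
			nopage++;
	return nopage;
}
```

After onlining sections 0 and 1 and then offlining section 1, the limit 
still covers both sections, and the walker simply sees the unplugged pfns 
as missing pages, exactly like a hole.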

> 
> - suppose that you have a platform which maps physical memory into the
> CPU's address space at 0x00_4000_0000 (1GB offset) and the kernel boots
> with 2GB of DRAM plugged by default. At that point we have not
> registered a swiotlb because we have less than 4GB of addressable
> physical memory, there is no IOMMU in that system, it's a happy world.
> Now assume that we plug an additional 2GB of DRAM into that system
> adjacent to the previous 2GB, from 0x00_C0000_0000 through
> 0x14_0000_0000, now we have physical addresses above 4GB, but we still
> don't have a swiotlb, some of our DMA_BIT_MASK(32) peripherals are going
> to be unable to DMA from that hot plugged memory, but they could if we
> had a swiotlb.

That's why platforms that hotplug memory should indicate the maximum 
possible PFN via some mechanism during boot. On x86-64 (and IIRC also 
arm64 now), this is done via the ACPI SRAT.

And that's where "max_possible_pfn" and "max_pfn" differ. See 
drivers/acpi/numa/srat.c:acpi_numa_memory_affinity_init():

	max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));


Using max_possible_pfn, the OS can properly set up the swiotlb, even 
though it wouldn't currently be required when just looking at max_pfn.

I documented that for virtio-mem under "swiotlb and DMA memory" in
	https://virtio-mem.gitlab.io/user-guide/user-guide-linux.html
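
To make the difference concrete, here is a simplified user-space sketch 
of the decision. It is a model under assumptions, not the arm64 or 
swiotlb code: the dma_* names and DMA_MODEL_* macros are hypothetical, a 
4 KiB page size is assumed, and the addresses are derived from the sizes 
in the scenario above (DRAM at a 1 GiB offset, 2 GiB at boot, 2 GiB more 
hotpluggable).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4 KiB pages; helper macros are local to this sketch. */
#define DMA_MODEL_PAGE_SHIFT 12
#define DMA_MODEL_PFN_UP(x) \
	(((x) + (1ULL << DMA_MODEL_PAGE_SHIFT) - 1) >> DMA_MODEL_PAGE_SHIFT)
#define DMA_MODEL_PFN_DOWN(x) ((x) >> DMA_MODEL_PAGE_SHIFT)

/* All-ones mask of the given width, avoiding an undefined 64-bit shift. */
uint64_t dma_model_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * limit_pfn is exclusive (one past the highest pfn considered). A bounce
 * buffer is needed if that highest pfn is beyond what the device can address.
 */
bool dma_model_needs_bounce(uint64_t limit_pfn, unsigned int dma_bits)
{
	uint64_t highest_ok_pfn = DMA_MODEL_PFN_DOWN(dma_model_mask(dma_bits));

	return limit_pfn > highest_ok_pfn + 1;
}
```

With boot memory ending at 0xC0000000 and hotpluggable memory ending at 
0x140000000, checking a 32-bit device against the boot-time limit says no 
bounce buffer is needed, while checking against the maximum possible limit 
says one is. Only the latter stays correct after hotplug.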

> 
> - now let's go even further but this is very contrived. Assume that the
> firmware has somewhat created a reserved memory region with a 'no-map'
> attribute thus indicating it does not want a struct page to be created
> for a specific PFN range, is it valid to "blindly" raise max_pfn if that
> region were to be at the end of the just hot-plugged memory?

no-map means that no direct mapping is to be created, right? We would 
still have a memmap IIRC, and the pages are PG_reserved.

Again, I think this is very similar to just having no-map regions, like 
random memory holes, within the existing memory layout.


What Chris proposes here is very similar to 
arch/x86/mm/init_64.c:update_end_of_memory_vars() called during 
arch_add_memory()->add_pages() on x86-64.
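
For reference, a user-space sketch of that kind of raise-only update, 
written from memory of the x86-64 helper rather than copied from it. The 
hp_* names and the 4 KiB page size are assumptions, and the real helper 
also updates other state that is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4 KiB pages; names are local to this sketch. */
#define HP_PAGE_SIZE 4096ULL
#define HP_PFN_UP(x) (((x) + HP_PAGE_SIZE - 1) / HP_PAGE_SIZE)

uint64_t hp_max_pfn;
uint64_t hp_max_low_pfn;

/*
 * Raise the limits if the newly added range ends above the current one.
 * A range added below the current end leaves them untouched.
 */
void hp_update_end_of_memory_vars(uint64_t start, uint64_t size)
{
	uint64_t end_pfn;

	assert(size > 0 && start + size > start);	/* no empty range, no wrap */
	end_pfn = HP_PFN_UP(start + size);

	if (end_pfn > hp_max_pfn) {
		hp_max_pfn = end_pfn;
		hp_max_low_pfn = end_pfn;
	}
}
```

The comparison matters when a block is added below the current end of 
memory: the limits must only ever grow.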

-- 
Thanks,

David / dhildenb




More information about the linux-arm-kernel mailing list