[PATCH v2 4/8] riscv: mm: Add memory hotplugging support
David Hildenbrand
david at redhat.com
Tue May 14 09:04:44 PDT 2024
On 14.05.24 16:04, Björn Töpel wrote:
> From: Björn Töpel <bjorn at rivosinc.com>
>
> For an architecture to support memory hotplugging, a couple of
> callbacks need to be implemented:
>
> arch_add_memory()
> This callback is responsible for adding the physical memory into the
> direct map, and calls into the memory hotplugging generic code via
> __add_pages(), which adds the corresponding struct page entries and
> updates the vmemmap mapping.
>
> arch_remove_memory()
> This is the inverse of the callback above.
>
> vmemmap_free()
> This function tears down the vmemmap mappings (if
> CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the
> backing vmemmap pages. Note that for persistent memory, an
> alternative allocator for the backing pages can be used: the
> vmem_altmap. This means that when the backing pages are freed,
> extra care is needed so that the correct deallocation method is
> used.
>
> arch_get_mappable_range()
> This function returns the PA range that the direct map can map.
> Used by the MHP internals for sanity checks.
>
> The page table unmap/teardown functions are heavily based on code from
> the x86 tree. The same remove_pgd_mapping() function is used in both
> vmemmap_free() and arch_remove_memory(), but in the latter function
> the backing pages are not removed.
>
> Signed-off-by: Björn Töpel <bjorn at rivosinc.com>
> ---
> arch/riscv/mm/init.c | 242 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 242 insertions(+)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 6f72b0b2b854..7f0b921a3d3a 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -1493,3 +1493,245 @@ void __init pgtable_cache_init(void)
> }
> }
> #endif
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)
> +{
> +	pte_t *pte;
> +	int i;
> +
> +	for (i = 0; i < PTRS_PER_PTE; i++) {
> +		pte = pte_start + i;
> +		if (!pte_none(*pte))
> +			return;
> +	}
> +
> +	free_pages((unsigned long)page_address(pmd_page(*pmd)), 0);
> +	pmd_clear(pmd);
> +}
> +
> +static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud)
> +{
> +	pmd_t *pmd;
> +	int i;
> +
> +	for (i = 0; i < PTRS_PER_PMD; i++) {
> +		pmd = pmd_start + i;
> +		if (!pmd_none(*pmd))
> +			return;
> +	}
> +
> +	free_pages((unsigned long)page_address(pud_page(*pud)), 0);
> +	pud_clear(pud);
> +}
> +
> +static void __meminit free_pud_table(pud_t *pud_start, p4d_t *p4d)
> +{
> +	pud_t *pud;
> +	int i;
> +
> +	for (i = 0; i < PTRS_PER_PUD; i++) {
> +		pud = pud_start + i;
> +		if (!pud_none(*pud))
> +			return;
> +	}
> +
> +	free_pages((unsigned long)page_address(p4d_page(*p4d)), 0);
> +	p4d_clear(p4d);
> +}
> +
> +static void __meminit free_vmemmap_storage(struct page *page, size_t size,
> +					   struct vmem_altmap *altmap)
> +{
> +	if (altmap)
> +		vmem_altmap_free(altmap, size >> PAGE_SHIFT);
> +	else
> +		free_pages((unsigned long)page_address(page), get_order(size));
If you unplug a DIMM that was already present at boot (this can happen on
x86-64; can it happen on riscv?), free_pages() would not be sufficient.
You'd be freeing a PG_reserved page, and that has to be freed differently.
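
To make the concern concrete, here is a minimal userspace model of the
dispatch I have in mind. It is a sketch, not kernel code: every type and
helper below (model_page, model_altmap, free_stats, model_get_order,
model_free_vmemmap_storage) is hypothetical and only mimics the decision
between the altmap, reserved-page, and buddy paths. In the kernel, the
reserved case would presumably end up calling something like
free_reserved_page() for each page; that mapping is an assumption, and
whether riscv can hit the case at all is still the open question above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

enum free_path { FREE_VIA_ALTMAP, FREE_VIA_RESERVED, FREE_VIA_BUDDY };

struct model_page   { bool reserved; };      /* stands in for PG_reserved */
struct model_altmap { size_t freed_pages; }; /* stands in for vmem_altmap */
struct free_stats   { size_t reserved_pages; size_t buddy_pages; };

/* Smallest order such that (MODEL_PAGE_SIZE << order) >= size. */
static size_t model_get_order(size_t size)
{
	size_t order = 0;

	while ((MODEL_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Pick the release path for a block of vmemmap backing pages. */
static enum free_path
model_free_vmemmap_storage(struct model_page *page, size_t size,
			   struct model_altmap *altmap,
			   struct free_stats *stats)
{
	size_t nr_pages = (size_t)1 << model_get_order(size);
	size_t i;

	if (altmap) {
		/* Backing pages came from the altmap: hand them back there. */
		altmap->freed_pages += size >> MODEL_PAGE_SHIFT;
		return FREE_VIA_ALTMAP;
	}

	if (page->reserved) {
		/* Boot-time pages: release one by one, clearing the flag. */
		for (i = 0; i < nr_pages; i++) {
			page[i].reserved = false;
			stats->reserved_pages++;
		}
		return FREE_VIA_RESERVED;
	}

	/* Ordinary allocation: return the whole block in one go. */
	stats->buddy_pages += nr_pages;
	return FREE_VIA_BUDDY;
}
```

The ordering is the point of the sketch: the altmap check comes first, the
reserved check second, and the plain free_pages()-style release is only
the fallback.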
--
Cheers,
David / dhildenb