[PATCH v3 07/11] arm64: fixmap: allow init before linear mapping is set up
Mark Rutland
mark.rutland at arm.com
Tue Apr 14 03:47:12 PDT 2015
Hi Ard,
On Fri, Apr 10, 2015 at 02:53:51PM +0100, Ard Biesheuvel wrote:
> This reworks early_ioremap_init() so it populates the various levels
> of translation tables while taking the following into account:
> - be prepared for any of the levels to have been populated already, as
> this may be the case once we move the kernel text mapping out of the
> linear mapping;
> - don't rely on __va() to translate the physical address in a page table
> entry to a virtual address, since this produces linear mapping addresses;
> instead, use the fact that at any level, we know exactly which page in
> swapper_pg_dir an entry could be pointing to if it points anywhere.
Can we not use Catalin's PHYS_OFFSET swizzling trick instead? i.e.
* Set PHYS_OFFSET so __va hits in the text mapping.
* Create the fixmap entries.
* Parse the DTB or UEFI memory map, figure out the real PHYS_OFFSET.
* Create linear mapping for the initial tables.
* Swap PHYS_OFFSET for the real version, and update init_mm.pgd to
point at the linear map alias of the swapper pgd.
It seemed like that would require less open-coding of table manipulation
code, as we could use __va() early.
Is there a limitation with that approach that I'm missing?
Otherwise, comments below.
> This allows us to defer the initialization of the linear mapping until
> after we have figured out where our RAM resides in the physical address
> space.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org>
> ---
> arch/arm64/include/asm/compiler.h | 2 +
> arch/arm64/kernel/vmlinux.lds.S | 14 +++--
> arch/arm64/mm/mmu.c | 117 +++++++++++++++++++++++++-------------
> 3 files changed, 90 insertions(+), 43 deletions(-)
>
> diff --git a/arch/arm64/include/asm/compiler.h b/arch/arm64/include/asm/compiler.h
> index ee35fd0f2236..dd342af63673 100644
> --- a/arch/arm64/include/asm/compiler.h
> +++ b/arch/arm64/include/asm/compiler.h
> @@ -27,4 +27,6 @@
> */
> #define __asmeq(x, y) ".ifnc " x "," y " ; .err ; .endif\n\t"
>
> +#define __pgdir __attribute__((section(".pgdir"),aligned(PAGE_SIZE)))
> +
> #endif /* __ASM_COMPILER_H */
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 98073332e2d0..604f285d3832 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -160,11 +160,15 @@ SECTIONS
>
> BSS_SECTION(0, 0, 0)
>
> - . = ALIGN(PAGE_SIZE);
> - idmap_pg_dir = .;
> - . += IDMAP_DIR_SIZE;
> - swapper_pg_dir = .;
> - . += SWAPPER_DIR_SIZE;
> + .pgdir (NOLOAD) : {
> + . = ALIGN(PAGE_SIZE);
> + idmap_pg_dir = .;
> + . += IDMAP_DIR_SIZE;
> + swapper_pg_dir = .;
> + __swapper_bs_region = . + PAGE_SIZE;
> + . += SWAPPER_DIR_SIZE;
> + *(.pgdir)
> + }
>
> _end = .;
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 60be58a160a2..c0427b5c90c7 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -341,6 +341,70 @@ static void __init __map_memblock(phys_addr_t start, phys_addr_t end)
> }
> #endif
>
> +struct mem_bootstrap_region {
The "region" naming is a little confusing IMO. To me it sounds like
something akin to a VMA rather than a set of tables.
> +#if CONFIG_ARM64_PGTABLE_LEVELS > 3
> + pud_t pud[PTRS_PER_PUD];
> +#endif
> +#if CONFIG_ARM64_PGTABLE_LEVELS > 2
> + pmd_t pmd[PTRS_PER_PMD];
> +#endif
> + pte_t pte[PTRS_PER_PTE];
> +};
> +
> +static void __init bootstrap_mem_region(unsigned long addr,
> + struct mem_bootstrap_region *reg,
> + pmd_t **ppmd, pte_t **ppte)
> +{
> + /*
> + * Avoid using the linear phys-to-virt translation __va() so that we
> + * can use this code before the linear mapping is set up. Note that
> + * any populated entries at any level can only point into swapper_pg_dir
> + * since no other translation table pages have been allocated yet.
> + * So at each level, we either need to populate it, or it has already
> + * been populated by a swapper_pg_dir table at the same level, in which
> + * case we can figure out its virtual address without applying __va()
> + * on the contents of the entry, using the following struct.
> + */
> + extern struct mem_bootstrap_region __swapper_bs_region;
> +
> + pgd_t *pgd = pgd_offset_k(addr);
> + pud_t *pud = (pud_t *)pgd;
> + pmd_t *pmd = (pmd_t *)pud;
> +
> +#if CONFIG_ARM64_PGTABLE_LEVELS > 3
> + if (pgd_none(*pgd)) {
> + clear_page(reg->pud);
> + pgd_populate(&init_mm, pgd, reg->pud);
What's PHYS_OFFSET expected to be at this point (for the purposes of
__pa() in the *_populate*() macros)?
If we're relying on __pa() to convert text->phys, won't __va() convert
phys->text at this point?
> + pud = reg->pud;
> + } else {
> + pud = __swapper_bs_region.pud;
> + }
> + pud += pud_index(addr);
> +#endif
Can we free the unused reg tables in the else cases? If __pa() works we
should be able to hand them to memblock, no?
[...]
> /*
> * The boot-ioremap range spans multiple pmds, for which
> - * we are not preparted:
> + * we are not prepared:
This typo has been bugging me for ages. I'll be glad to see it go!
Mark.