[RFC] ARM64: 4 level page table translation for 4KB pages

Arnd Bergmann arnd at arndb.de
Mon Mar 31 08:58:42 EDT 2014


On Monday 31 March 2014 13:45:51 Catalin Marinas wrote:
> On Mon, Mar 31, 2014 at 12:31:14PM +0100, Catalin Marinas wrote:
> > On Mon, Mar 31, 2014 at 07:56:53AM +0100, Arnd Bergmann wrote:
> > > On Monday 31 March 2014 12:51:07 Jungseok Lee wrote:
> > > > The current ARM64 kernel cannot support 4KB pages for the 40-bit physical
> > > > address space described in [1] due to one major issue and one minor issue.
> > > > 
> > > > Firstly, the kernel logical memory map (0xffffffc000000000-0xffffffffffffffff)
> > > > cannot cover the DRAM region from 544GB to 1024GB in [1]. Specifically, the
> > > > ARM64 kernel fails to create a mapping for this region in the map_mem function
> > > > (arch/arm64/mm/mmu.c) because __phys_to_virt overflows the address space for
> > > > this region. I've used 3.14-rc8+Fast Models to validate the statement.
> [...]
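
The overflow described in the quoted text can be checked with plain integer
arithmetic. The following is a minimal sketch, not kernel code: it assumes
PHYS_OFFSET is 2GB (0x80000000), which the thread does not state, and the
helper names phys_to_virt_sketch and in_linear_map are made up for
illustration. PAGE_OFFSET is the start of the linear map quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the kernel linear map, as quoted in the thread. */
#define PAGE_OFFSET  UINT64_C(0xffffffc000000000)
/* ASSUMPTION: DRAM starts at 2GB; the thread does not give this value. */
#define PHYS_OFFSET  UINT64_C(0x0000000080000000)
#define GB(n)        ((uint64_t)(n) << 30)

/* Linear phys-to-virt translation on plain 64-bit integers.
 * The addition wraps modulo 2^64 when the result does not fit. */
static uint64_t phys_to_virt_sketch(uint64_t phys)
{
	assert(phys >= PHYS_OFFSET);
	return phys - PHYS_OFFSET + PAGE_OFFSET;
}

/* A result below PAGE_OFFSET means the addition wrapped around. */
static int in_linear_map(uint64_t virt)
{
	return virt >= PAGE_OFFSET;
}
```

Under these assumptions, phys_to_virt_sketch(GB(544)) wraps around to
0x4780000000, which is not a kernel address at all, while
phys_to_virt_sketch(GB(34)) still lands inside the linear map.
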
> > > a) always use a four-level page table in kernel space, regardless of
> > > whether we do it in user space. We can move the kernel mappings down
> > > in address space either by one 512GB entry to 0xffffff0000000000, or
> > > to match the 64k-page location at 0xfffffc0000000000, or all the way
> > > to 0xfffc000000000000. In any case, we can have all the dynamic
> > > mappings within one 512GB area and pretend we have a three-level
> > > page table for them, while the rest of DRAM is mapped statically at
> > > early boot time using 512GB large pages.
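
To put the three candidate base addresses in perspective, the sketch below
counts how many 512GB top-level entries lie between a base address and the
top of the 64-bit address space. It assumes 4KB pages with 9 bits per level,
so one top-level entry covers 2^39 bytes; entries_to_top is a hypothetical
helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* With 4KB pages and four levels, one top-level entry covers 2^39 = 512GB. */
#define PGDIR_SHIFT 39

/* Number of 512GB entries between 'base' and the top of the 64-bit
 * address space. 'base' must be non-zero and 512GB aligned. */
static uint64_t entries_to_top(uint64_t base)
{
	uint64_t span = UINT64_C(0) - base; /* 2^64 - base, modulo 2^64 */

	assert(base != 0);
	assert((span & ((UINT64_C(1) << PGDIR_SHIFT) - 1)) == 0);
	return span >> PGDIR_SHIFT;
}
```

For the addresses named above this gives 2 entries for 0xffffff0000000000,
8 for 0xfffffc0000000000 and 2048 for 0xfffc000000000000.
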
> > 
> > That's a workaround but we end up with two (or more) kernel pgds - one
> > for vmalloc, ioremap etc. and another (static) one for the kernel linear
> > mapping. So far there isn't any memory mapping carved out but we have to
> > be careful in the future.
> > 
> > However, kernel page table walking would be a bit slower with 4-levels.
> 
> Yet another approach would be to enable 4-levels of page tables (no
> nopud.h) in the kernel with pgd_offset_k doing the right thing for 4
> levels but user space configured to 3-levels only and pgd_offset
> returning 0 while pud_offset does what pgd_offset currently implements
> for 3 levels.
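
A rough way to picture the split: with 4KB pages each level indexes 9 bits
of the address, so a user address limited to 39 bits always has top-level
index 0, and the next level down uses the bits that the 3-level pgd index
uses today. The sketch below only does the bit arithmetic; level_index is a
hypothetical helper and the shift values assume 4KB pages with 512-entry
tables.

```c
#include <assert.h>
#include <stdint.h>

/* Shifts for 4KB pages, 9 bits per level (assumed layout). */
#define PAGE_SHIFT   12
#define PMD_SHIFT    21
#define PUD_SHIFT    30
#define PGDIR_SHIFT  39
#define PTRS_PER_TBL 512

/* Index into the table at the level selected by 'shift'. */
static unsigned int level_index(uint64_t va, unsigned int shift)
{
	assert(shift >= PAGE_SHIFT && shift <= PGDIR_SHIFT);
	return (unsigned int)((va >> shift) & (PTRS_PER_TBL - 1));
}
```

For the highest 39-bit user address, 0x0000007fffffffff,
level_index(va, PGDIR_SHIFT) is 0 and level_index(va, PUD_SHIFT) is 511.
For the kernel address 0xffffffc000000000 the same calls give 511 and 256.
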

Either I was unclear earlier, or I misunderstand what you are saying
here. How is that different from what I wrote above?

> This solves the page table walk latency for user but not for kernel.
> Anyway, if the hardware memory map is so sparse (a real SoC, not the
> spec), I don't think we have other ways to support it with 3-levels of
> page table for the kernel, unless we hack __virt_to_phys/__phys_to_virt.

Right. I also wonder whether the SoCs are configurable in the way they
map the memory. If so, they might be able to do something smarter than
what the document says and map the >2GB memory contiguously starting at
0x08.8000.0000.
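
As a back-of-the-envelope check of how much contiguous memory at that
address the existing 256GB linear map could reach, here is a small sketch.
It assumes PHYS_OFFSET is 2GB, which is not stated in this thread, and
linear_map_covers is a made-up helper rather than anything in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: DRAM starts at 2GB; the thread does not give this value. */
#define PHYS_OFFSET      UINT64_C(0x80000000)
/* 0xffffffc000000000 up to the top of the address space is 2^38 = 256GB. */
#define LINEAR_MAP_SIZE  (UINT64_C(1) << 38)
#define GB(n)            ((uint64_t)(n) << 30)

/* True if [start, start + size) lies inside the physical window that a
 * 256GB linear map starting at PHYS_OFFSET can reach. */
static int linear_map_covers(uint64_t start, uint64_t size)
{
	uint64_t off;

	assert(size != 0);
	if (start < PHYS_OFFSET)
		return 0;
	off = start - PHYS_OFFSET;
	return off < LINEAR_MAP_SIZE && size <= LINEAR_MAP_SIZE - off;
}
```

Under these assumptions, up to 224GB placed contiguously at 0x08.8000.0000
would still be reachable, while anything at 544GB would not.
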

	Arnd


