[PATCH v4] ARM: LPAE: Fix mapping in alloc_init_pte for unaligned addresses
Catalin Marinas
catalin.marinas at arm.com
Fri Feb 1 11:37:22 EST 2013
On Fri, Feb 01, 2013 at 04:32:54PM +0000, Christoffer Dall wrote:
> On Fri, Feb 1, 2013 at 11:26 AM, Catalin Marinas
> <catalin.marinas at arm.com> wrote:
> > On Tue, Jan 29, 2013 at 03:07:16PM +0000, R Sricharan wrote:
> >> With LPAE enabled, alloc_init_section() does not map the
> >> entire address space for unaligned addresses.
> >>
> >> The issue is also reproduced with CMA + LPAE. CMA tries to map 16MB
> >> with page granularity mappings during boot. alloc_init_pte()
> >> is called and, out of the 16MB, only 2MB gets mapped; the rest remains
> >> inaccessible.
> >>
> >> Because of this OMAP5 boot is broken with CMA + LPAE enabled.
> >> Fix the issue by ensuring that the entire address range is
> >> mapped.
> >>
> >> Signed-off-by: R Sricharan <r.sricharan at ti.com>
> >> Cc: Catalin Marinas <catalin.marinas at arm.com>
> >> Cc: Christoffer Dall <chris at cloudcar.com>
> >> Cc: Russell King <linux at arm.linux.org.uk>
> >> Acked-by: Santosh Shilimkar <santosh.shilimkar at ti.com>
> >> Tested-by: Christoffer Dall <chris at cloudcar.com>
> >> ---
> >> [V2] Moved the loop to alloc_init_pte as per Russell's
> >> feedback and changed the subject accordingly.
> >> Using PMD_XXX instead of SECTION_XXX to avoid
> >> different loop increments with/without LPAE.
> >>
> >> [v3] Removed the dummy variable phys and updated
> >> the commit log for CMA case.
> >>
> >> [v4] Resending with updated change log and
> >> updating the tags.
> >>
> >> arch/arm/mm/mmu.c | 20 ++++++++++++++++----
> >> 1 file changed, 16 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> >> index f8388ad..b94c313 100644
> >> --- a/arch/arm/mm/mmu.c
> >> +++ b/arch/arm/mm/mmu.c
> >> @@ -569,11 +569,23 @@ static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
> >> unsigned long end, unsigned long pfn,
> >> const struct mem_type *type)
> >> {
> >> - pte_t *pte = early_pte_alloc(pmd, addr, type->prot_l1);
> >> + unsigned long next;
> >> + pte_t *pte;
> >> +
> >> do {
> >> - set_pte_ext(pte, pfn_pte(pfn, __pgprot(type->prot_pte)), 0);
> >> - pfn++;
> >> - } while (pte++, addr += PAGE_SIZE, addr != end);
> >> + if ((end-addr) & PMD_MASK)
> >> + next = (addr + PMD_SIZE) & PMD_MASK;
> >> + else
> >> + next = end;
> >
> > Can use pmd_addr_end(addr, end) here?
> >
> >> + pte = early_pte_alloc(pmd, addr, type->prot_l1);
> >> + do {
> >> + set_pte_ext(pte, pfn_pte(pfn,
> >> + __pgprot(type->prot_pte)), 0);
> >> + pfn++;
> >> + } while (pte++, addr += PAGE_SIZE, addr != next);
> >> +
> >> + } while (pmd++, addr = next, addr != end);
> >
> > I would actually keep the loop in alloc_init_section(). There is even a
> > comment in there saying "no need to loop" but you actually moved the
> > loop in alloc_init_pte().
> >
> > I'm proposing a simpler patch below (only lightly tested on VE/C-A9).
> > The only difference is that we do more flush_pmd_entry() calls but I'm
> > not really bothered, it's during boot and you won't notice.
> >
> >
> > diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> > index 9c82f98..eaa8ba8 100644
> > --- a/arch/arm/include/asm/pgtable.h
> > +++ b/arch/arm/include/asm/pgtable.h
> > @@ -205,6 +205,11 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> >
> > #define pte_present_user(pte) (pte_present(pte) && (pte_val(pte) & L_PTE_USER))
> >
> > +#define section_addr_end(addr, end) \
> > +({ unsigned long __boundary = ((addr) + SECTION_SIZE) & SECTION_MASK; \
> > + (__boundary - 1 < (end) - 1)? __boundary: (end); \
> > +})
> > +
> > #if __LINUX_ARM_ARCH__ < 6
> > static inline void __sync_icache_dcache(pte_t pteval)
> > {
> > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > index 9f06102..0d0faed 100644
> > --- a/arch/arm/mm/mmu.c
> > +++ b/arch/arm/mm/mmu.c
> > @@ -581,34 +581,19 @@ static void __init alloc_init_section(pud_t *pud, unsigned long addr,
> > const struct mem_type *type)
> > {
> > pmd_t *pmd = pmd_offset(pud, addr);
> > + unsigned long next;
> >
> > - /*
> > - * Try a section mapping - end, addr and phys must all be aligned
> > - * to a section boundary. Note that PMDs refer to the individual
> > - * L1 entries, whereas PGDs refer to a group of L1 entries making
> > - * up one logical pointer to an L2 table.
> > - */
> > - if (type->prot_sect && ((addr | end | phys) & ~SECTION_MASK) == 0) {
> > - pmd_t *p = pmd;
> > -
> > -#ifndef CONFIG_ARM_LPAE
> > - if (addr & SECTION_SIZE)
> > - pmd++;
> > -#endif
> > -
> > - do {
> > + do {
> > + next = section_addr_end(addr, end);
> > + /* try section mapping first */
> > + if (((addr | next | phys) & ~SECTION_MASK) == 0) {
> > *pmd = __pmd(phys | type->prot_sect);
> > - phys += SECTION_SIZE;
> > - } while (pmd++, addr += SECTION_SIZE, addr != end);
> > -
> > - flush_pmd_entry(p);
> > - } else {
> > - /*
> > - * No need to loop; pte's aren't interested in the
> > - * individual L1 entries.
> > - */
> > - alloc_init_pte(pmd, addr, end, __phys_to_pfn(phys), type);
> > - }
> > + flush_pmd_entry(pmd);
> > + } else {
> > + alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys), type);
>
> aren't you wasting memory here? The pte table doesn't need a full page, but
> the memblock allocator allocates a full page, right?
>
> I thought this was the rationale behind Russell's previous comments on
> Santosh's earlier patch version.
You are right, it's allocating more ptes. Then we can use pmd_addr_end().
I'll go back to the code.
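
For reference, the sort of loop I have in mind is sketched below. It is
untested and only illustrative: it assumes the existing early_pte_alloc()
and set_pte_ext() helpers and the generic pmd_addr_end() macro, and the
L2 table allocation still needs looking at so we don't take a whole page
per call:

    static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
                                      unsigned long end, unsigned long pfn,
                                      const struct mem_type *type)
    {
            unsigned long next;
            pte_t *pte;

            do {
                    /* stop at the next pmd boundary or at end, whichever comes first */
                    next = pmd_addr_end(addr, end);
                    pte = early_pte_alloc(pmd, addr, type->prot_l1);
                    do {
                            /* map one page at a time up to the pmd boundary */
                            set_pte_ext(pte, pfn_pte(pfn, __pgprot(type->prot_pte)), 0);
                            pfn++;
                    } while (pte++, addr += PAGE_SIZE, addr != next);
            } while (pmd++, addr = next, addr != end);
    }

That keeps the walk over pmds in one place while the inner loop only
fills ptes up to each pmd boundary.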
--
Catalin