[PATCH v3 2/2] arm64: Mark kernel page ranges contiguous

Will Deacon will.deacon at arm.com
Fri Feb 26 09:28:26 PST 2016


On Thu, Feb 25, 2016 at 02:46:54PM -0600, Jeremy Linton wrote:
> On 02/25/2016 10:16 AM, Will Deacon wrote:
> >On Fri, Feb 19, 2016 at 11:46:23AM -0600, Jeremy Linton wrote:
> 
> (trimming)
> 
> >>+static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
> >>+{
> >>+	int i;
> >>+
> >>+	pte -= CONT_RANGE_OFFSET(addr);
> >>+	for (i = 0; i < CONT_PTES; i++) {
> >>+		if (pte_cont(*pte))
> >>+			set_pte(pte, pte_mknoncont(*pte));
> >>+		pte++;
> >>+	}
> >>+	flush_tlb_all();
> >
> >Do you still need this invalidation? I thought the tables weren't even
> >live at this point?
> 
> Well it continues to match the calls in alloc_init_p*.

Ok, but if it's not needed (and I don't think it is), then we should
remove the invalidation from there too rather than add more of it.

> I guess the worry is the extra flush that happens at create_mapping_late(),
> if mapping ranges aren't cont aligned? (because the loop won't actually be
> doing any set_pte's)

I'm just concerned with trying to make this code understandable! I doubt
there's a performance argument to be made.

> If this and the alloc_init_p* flushes are to be removed, there should
> probably be a way to detect any cases where the splits are happening after
> the tables have been activated. This might be a little less straightforward
> given efi_create_mapping().

I think that's a separate issue, since splitting a live page table is
dodgy regardless of the flushing. But yes, it would be nice to detect
that case and scream about it.
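
To illustrate what "detect that case and scream about it" could look like, here is a minimal user-space sketch. It is not kernel code: `struct pgtable_model`, its `live` flag and `note_split()` are all hypothetical names made up for this example, and a real implementation would need some in-kernel way to know that the tables have been installed (which the thread does not specify).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a page table: only tracks whether it is live. */
struct pgtable_model {
	bool live;                        /* set once the table is in use */
	unsigned long splits_while_live;  /* how many times we screamed */
};

/*
 * Hypothetical hook called before splitting a mapping at 'addr'.
 * Returns true if the split is safe (table not yet live); otherwise
 * prints a warning, counts the event and returns false.
 */
static bool note_split(struct pgtable_model *t, unsigned long addr)
{
	assert(t != NULL);

	if (t->live) {
		t->splits_while_live++;
		fprintf(stderr, "WARNING: split of live page table at %#lx\n",
			addr);
		return false;
	}
	return true;
}
```

A caller would set `live` when the tables are activated and call `note_split()` from the split path; in this model a split is only flagged, not prevented.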

> >
> >>+}
> >>+
> >>+/*
> >>+ * Given a range of PTEs set the pfn and provided page protection flags
> >>+ */
> >>+static void __populate_init_pte(pte_t *pte, unsigned long addr,
> >>+			       unsigned long end, phys_addr_t phys,
> >>+			       pgprot_t prot)
> >>+{
> >>+	unsigned long pfn = __phys_to_pfn(phys);
> >>+
> >>+	do {
> >>+		/* clear all the bits except the pfn, then apply the prot */
> >>+		set_pte(pte, pfn_pte(pfn, prot));
> >>+		pte++;
> >>+		pfn++;
> >>+		addr += PAGE_SIZE;
> >>+	} while (addr != end);
> >>+}
> >>+
> (trimming)
> >>+
> >>  	do {
> >>-		set_pte(pte, pfn_pte(pfn, prot));
> >>-		pfn++;
> >>-	} while (pte++, addr += PAGE_SIZE, addr != end);
> >>+		next = min(end, (addr + CONT_SIZE) & CONT_MASK);
> >>+		if (((addr | next | phys) & ~CONT_MASK) == 0) {
> >>+			/* a block of CONT_PTES	 */
> >>+			__populate_init_pte(pte, addr, next, phys,
> >>+					    prot | __pgprot(PTE_CONT));
> >>+		} else {
> >>+			/*
> >>+			 * If the range being split is already inside of a
> >>+			 * contiguous range but this PTE isn't going to be
> >>+			 * contiguous, then we want to unmark the adjacent
> >>+			 * ranges, then update the portion of the range we
> >>+			 * are interested in.
> >>+			 */
> >>+			clear_cont_pte_range(pte, addr);
> >>+			__populate_init_pte(pte, addr, next, phys, prot);
> >
> >I don't understand the comment or the code here... the splitting is now
> >done separately, and I can't think of a scenario where you need to clear
> >the cont hint explicitly for adjacent ptes.
> 
> 	
> My understanding is that splitting is initially running this code path (via
> map_kernel_chunk, then again via create_mapping_late where splits won't
> happen). So, split_pmd() is creating cont ranges. When the ranges aren't
> sufficiently aligned then this is wiping out the cont mappings immediately
> after their creation.

Gotcha, thanks for the explanation (I somehow overlooked the initial
page table creation).

Will
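
For readers following the quoted hunk, the chunking and alignment test can be sketched in user space as below. This is only an illustration under stated assumptions: the constants assume a 4K granule with 16 PTEs per contiguous range (so a 64K `CONT_SIZE`), and `cont_chunk_end()` and `chunk_can_be_cont()` are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: 4K pages, 16 PTEs per contiguous range (64K). */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)
#define CONT_PTES  16
#define CONT_SIZE  (CONT_PTES * PAGE_SIZE)
#define CONT_MASK  (~(CONT_SIZE - 1))

/*
 * Mirrors "next = min(end, (addr + CONT_SIZE) & CONT_MASK)": the chunk
 * ends at the next CONT_SIZE boundary, clamped to the end of the range.
 */
static uint64_t cont_chunk_end(uint64_t addr, uint64_t end)
{
	uint64_t next = (addr + CONT_SIZE) & CONT_MASK;

	assert(addr < end);
	return next < end ? next : end;
}

/*
 * Mirrors "((addr | next | phys) & ~CONT_MASK) == 0": the chunk may
 * carry the contiguous hint only if its virtual start, virtual end and
 * physical start are all CONT_SIZE aligned.
 */
static bool chunk_can_be_cont(uint64_t addr, uint64_t next, uint64_t phys)
{
	return ((addr | next | phys) & ~CONT_MASK) == 0;
}
```

With these assumptions, a chunk starting at 0x10000 in a range ending at 0x40000 ends at 0x20000 and qualifies when `phys` is 64K aligned. A chunk starting at 0x11000, or one clamped to an unaligned `end`, does not qualify and would take the non-contiguous path discussed above.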


