[PATCH 12/12] um: refactor TLB update handling

Benjamin Berg benjamin at sipsolutions.net
Mon Apr 22 00:22:22 PDT 2024


On Mon, 2024-04-22 at 10:51 +0800, Tiwei Bie wrote:
> On 4/18/24 5:23 PM, benjamin at sipsolutions.net wrote:
> > diff --git a/arch/um/include/asm/mmu.h b/arch/um/include/asm/mmu.h
> > index 37eb6e89e79a..bf8da736609c 100644
> > --- a/arch/um/include/asm/mmu.h
> > +++ b/arch/um/include/asm/mmu.h
> > @@ -10,6 +10,10 @@
> >  
> >  typedef struct mm_context {
> >  	struct mm_id id;
> > +
> > +	/* Address range in need of a TLB sync */
> > +	long int sync_tlb_range_from;
> > +	long int sync_tlb_range_to;
> 
> Why not "unsigned long"?

Oops, yes, it should be "unsigned long".

> 
> >  } mm_context_t;
> >  
> >  extern void __switch_mm(struct mm_id * mm_idp);
> > diff --git a/arch/um/include/asm/pgtable.h
> > b/arch/um/include/asm/pgtable.h
> > index e1ece21dbe3f..5bb397b65efb 100644
> > --- a/arch/um/include/asm/pgtable.h
> > +++ b/arch/um/include/asm/pgtable.h
> > @@ -244,6 +244,38 @@ static inline void set_pte(pte_t *pteptr,
> > pte_t pteval)
> >  
> >  #define PFN_PTE_SHIFT		PAGE_SHIFT
> >  
> > +static inline void um_tlb_mark_sync(struct mm_struct *mm, unsigned
> > long start,
> > +				    unsigned long end)
> > +{
> > +	if (!mm->context.sync_tlb_range_to) {
> > +		mm->context.sync_tlb_range_from = start;
> > +		mm->context.sync_tlb_range_to = end;
> > +	} else {
> > +		if (start < mm->context.sync_tlb_range_from)
> > +			mm->context.sync_tlb_range_from = start;
> > +		if (end > mm->context.sync_tlb_range_to)
> > +			mm->context.sync_tlb_range_to = end;
> > +	}
> > +}
> 
> IIUC, in some cases, the range [sync_tlb_range_from, sync_tlb_range_to)
> might become very large when merging non-adjacent ranges? Could that
> be an issue?

I figured it is not a big problem. It will result in scanning the
entire page table once to check whether the NEW_PAGE bit is set on any
PTE. I am assuming that this will happen almost never and scanning the
page table (but not doing syscalls) is reasonably cheap at the end.

> > diff --git a/arch/um/include/asm/tlbflush.h b/arch/um/include/asm/tlbflush.h
> > index d7cf82023b74..62816f6f1c91 100644
> > --- a/arch/um/include/asm/tlbflush.h
> > +++ b/arch/um/include/asm/tlbflush.h
> > @@ -9,24 +9,50 @@
> >  #include <linux/mm.h>
> >  
> >  /*
> > - * TLB flushing:
> > + * In UML, we need to sync the TLB over by using mmap/munmap/mprotect syscalls
> > + * from the process handling the MM (which can be the kernel itself).
> > + *
> > + * To track updates, we can hook into set_ptes and flush_tlb_*. With set_ptes
> > + * we catch all PTE transitions where memory that was unusable becomes usable.
> > + * While with flush_tlb_* we can track any memory that becomes unusable and
> > + * even if a higher layer of the page table was modified.
> > + *
> > + * So, we simply track updates using both methods and mark the memory area to
> > + * be synced later on. The only special case is that flush_tlb_kern_* needs to
> > + * be executed immediately as there is no good synchronization point in that
> > + * case. In contrast, in the set_ptes case we can wait for the next kernel
> > + * segfault before we do the synchornization.
> >   *
> > - *  - flush_tlb() flushes the current mm struct TLBs
> >   *  - flush_tlb_all() flushes all processes TLBs
> >   *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
> >   *  - flush_tlb_page(vma, vmaddr) flushes one page
> > - *  - flush_tlb_kernel_vm() flushes the kernel vm area
> >   *  - flush_tlb_range(vma, start, end) flushes a range of pages
> > + *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
> >   */
> >  
> > +extern int um_tlb_sync(struct mm_struct *mm);
> > +
> >  extern void flush_tlb_all(void);
> >  extern void flush_tlb_mm(struct mm_struct *mm);
> > -extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, 
> > -			    unsigned long end);
> > -extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long address);
> > -extern void flush_tlb_kernel_vm(void);
> > -extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
> > -extern void __flush_tlb_one(unsigned long addr);
> > +
> > +static void flush_tlb_page(struct vm_area_struct *vma, unsigned long address)
> > +{
> > +	um_tlb_mark_sync(vma->vm_mm, address, address + PAGE_SIZE);
> > +}
> > +
> > +static void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
> > +			    unsigned long end)
> > +{
> > +	um_tlb_mark_sync(vma->vm_mm, start, end);
> > +}
> > +
> > +static void flush_tlb_kernel_range(unsigned long start, unsigned long end)
> > +{
> > +	um_tlb_mark_sync(&init_mm, start, end);
> > +
> > +	/* Kernel needs to be synced immediately */
> > +	um_tlb_sync(&init_mm);
> > +}
> 
> Nit: this is a header file, these functions should be defined as inline functions.

Yup, thanks!

> > diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
> > index c137ff6f84dd..232aa7601d5d 100644
> > --- a/arch/um/kernel/tlb.c
> > +++ b/arch/um/kernel/tlb.c
> [...]
> >  
> > -void flush_tlb_kernel_range(unsigned long start, unsigned long
> > end)
> > -{
> > -	flush_tlb_kernel_range_common(start, end);
> > -}
> > -
> > -void flush_tlb_kernel_vm(void)
> > -{
> > -	flush_tlb_kernel_range_common(start_vm, end_vm);
> > -}
> 
> The build breaks with this change, as there is still a call to
> flush_tlb_kernel_vm() in ubd.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/um/drivers/ubd_kern.c?id=fb5d1d389c9e78d68f1f71f926d6251017579f5b#n774

Oh, thanks for the pointer!

I do not see a good reason for that call to even exist. My best theory
right now is that it existed to avoid later pagefaults for new memory
regions (the vmalloc?). So a workaround that is not needed anymore with
this patch.

Benjamin





More information about the linux-um mailing list