[RFC PATCH V3 3/6] arm: mm: implement get_user_pages_fast
Steve Capper
steve.capper at linaro.org
Thu Mar 13 04:24:18 EDT 2014
On 12 March 2014 16:55, Will Deacon <will.deacon at arm.com> wrote:
> On Wed, Mar 12, 2014 at 04:32:00PM +0000, Peter Zijlstra wrote:
>> On Wed, Mar 12, 2014 at 01:40:20PM +0000, Steve Capper wrote:
>> > +void pmdp_splitting_flush(struct vm_area_struct *vma, unsigned long address,
>> > +			  pmd_t *pmdp)
>> > +{
>> > +	pmd_t pmd = pmd_mksplitting(*pmdp);
>> > +	VM_BUG_ON(address & ~PMD_MASK);
>> > +	set_pmd_at(vma->vm_mm, address, pmdp, pmd);
>> > +
>> > +	/* dummy IPI to serialise against fast_gup */
>> > +	smp_call_function(thp_splitting_flush_sync, NULL, 1);
>> > +}
>>
>> do you really need to IPI the entire machine? Wouldn't the mm's TLB
>> invalidate mask be sufficient?
Hey Will,
>
> Are you thinking of using mm_cpumask(vma->vm_mm)? That's rarely cleared on
> ARM, so it tends to identify everywhere the task has ever run, regardless of
> TLB state. The reason is that the mask is also used for cache flushing
> (which is further overloaded for VIVT and VIPT w/ software maintenance
> broadcast).
For the THP splitting case, I want a cpu mask that represents any cpu
that is touching the address space belonging to the THP. That way, the
IPI will block on any fast_gup walks in progress that cover the THP
(fast_gup runs with interrupts disabled, so the IPI can't complete
until the walk has finished).
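
To make the broadcast vs. targeted difference concrete, here is a rough
userspace model, not kernel code. NR_CPUS, struct mm_model, its "walkers"
field and the ipis_*() helpers are all made up for illustration; in the
kernel the targeted variant would presumably go through something like
smp_call_function_many() with a suitable mask, and how that mask gets
maintained is exactly the open question here.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical machine size for this model */

/* One bit per CPU; stands in for a kernel cpumask. */
typedef uint32_t cpumask_model_t;

/* Hypothetical per-mm record of CPUs that may be walking its page tables. */
struct mm_model {
	cpumask_model_t walkers;
};

static void mm_model_enter(struct mm_model *mm, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	mm->walkers |= (cpumask_model_t)1 << cpu;
}

static void mm_model_leave(struct mm_model *mm, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	mm->walkers &= ~((cpumask_model_t)1 << cpu);
}

/* Broadcast: every CPU except the sender gets an IPI. */
static unsigned int ipis_broadcast(unsigned int self)
{
	assert(self < NR_CPUS);
	return NR_CPUS - 1;
}

/* Targeted: only CPUs in the mm's mask, minus the sender, get an IPI. */
static unsigned int ipis_targeted(const struct mm_model *mm, unsigned int self)
{
	cpumask_model_t targets;
	unsigned int n = 0;

	assert(self < NR_CPUS);
	targets = mm->walkers & ~((cpumask_model_t)1 << self);

	/* Count the set bits, one per interrupted CPU. */
	while (targets) {
		targets &= targets - 1;
		n++;
	}
	return n;
}
```

With three CPUs recorded in the mask and the sender being one of them, the
targeted variant interrupts two CPUs where the broadcast interrupts seven.
The saving only holds if the mask is kept accurate, though.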
>
> I had a patch improving this a bit (below) but I didn't manage to see any
> significant improvements so I didn't pursue it further. What we probably want
> to try is nuking the mask on a h/w broadcast TLBI operation with ARMv7, but
> it will mean adding horrible checks to tlbflush.h
Thanks! I'm still waking up, will have a think about this.
Cheers,
--
Steve
>
> Will
>
> --->8
>
> commit fd24d6170839b200cc2916c83847ca46e889f1ca
> Author: Will Deacon <will.deacon at arm.com>
> Date: Thu Jul 25 16:38:34 2013 +0100
>
> ARM: mm: use mm_cpumask to keep track of dirty TLBs on v7
>
> Signed-off-by: Will Deacon <will.deacon at arm.com>
>
> diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
> index def9e570199f..f2a1cb7edfca 100644
> --- a/arch/arm/include/asm/tlbflush.h
> +++ b/arch/arm/include/asm/tlbflush.h
> @@ -202,6 +202,7 @@
>  #ifndef __ASSEMBLY__
>
>  #include <linux/sched.h>
> +#include <asm/smp_plat.h>
>
>  struct cpu_tlb_fns {
>  	void (*flush_user_range)(unsigned long, unsigned long, struct vm_area_struct *);
> @@ -401,6 +402,17 @@ static inline void __flush_tlb_mm(struct mm_struct *mm)
>  {
>  	const unsigned int __tlb_flag = __cpu_tlb_flags;
>
> +	if (!cache_ops_need_broadcast()) {
> +		int cpu = get_cpu();
> +		if (cpumask_equal(mm_cpumask(mm), cpumask_of(cpu))) {
> +			cpumask_clear_cpu(cpu, mm_cpumask(mm));
> +			local_flush_tlb_mm(mm);
> +			put_cpu();
> +			return;
> +		}
> +		put_cpu();
> +	}
> +
>  	if (tlb_flag(TLB_WB))
>  		dsb(ishst);
>
> @@ -459,6 +471,17 @@ __flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr)
>  {
>  	const unsigned int __tlb_flag = __cpu_tlb_flags;
>
> +	if (!cache_ops_need_broadcast()) {
> +		int cpu = get_cpu();
> +		if (cpumask_equal(mm_cpumask(vma->vm_mm), cpumask_of(cpu))) {
> +			cpumask_clear_cpu(cpu, mm_cpumask(vma->vm_mm));
> +			local_flush_tlb_page(vma, uaddr);
> +			put_cpu();
> +			return;
> +		}
> +		put_cpu();
> +	}
> +
>  	uaddr = (uaddr & PAGE_MASK) | ASID(vma->vm_mm);
>
>  	if (tlb_flag(TLB_WB))
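
For anyone following along, the fast path that the quoted patch adds to
__flush_tlb_mm() and __flush_tlb_page() can be modelled in userspace
roughly as below. This is only a sketch of the decision logic as I read
it: tlbmask_t, enum flush_kind and flush_tlb_mm_model() are invented
names, the mask is a plain bitmap, and no actual TLB maintenance happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per CPU; stands in for mm_cpumask(mm). */
typedef uint32_t tlbmask_t;

enum flush_kind { FLUSH_LOCAL, FLUSH_BROADCAST };

/*
 * Model of the added fast path: if cache ops need no broadcast and the
 * current CPU is the only one in the mm's mask, clear our bit and flush
 * locally. Otherwise fall through to the existing broadcast path and
 * leave the mask untouched.
 */
static enum flush_kind flush_tlb_mm_model(tlbmask_t *mm_mask, unsigned int cpu,
					  bool cache_ops_need_broadcast)
{
	tlbmask_t self;

	assert(cpu < 32);
	self = (tlbmask_t)1 << cpu;

	if (!cache_ops_need_broadcast && *mm_mask == self) {
		*mm_mask &= ~self;
		return FLUSH_LOCAL;
	}
	return FLUSH_BROADCAST;
}
```

Note that the mask only ever shrinks in the single-CPU case, so an mm that
has run on two or more CPUs keeps its stale bits and always takes the
broadcast path.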