Excessive TLB flush ranges

Thomas Gleixner tglx at linutronix.de
Mon May 15 23:37:18 PDT 2023


On Mon, May 15 2023 at 22:31, Russell King wrote:
> On Mon, May 15, 2023 at 11:11:45PM +0200, Thomas Gleixner wrote:
>> But that's not necessarily true for ARM32 as there are no IPIs involved
>> on the machine we are using, which is a dual-core Cortex-A9.
>> 
>> So I came up with the hack below, which is equally fast as the full
>> flush variant while the performance impact on the other CPUs is minimally
>> lower according to perf.
>> 
>> That probably should have another argument which tells how many TLBs
>> this flush affects, i.e. 3 in this example, so an architecture can
>> sensibly decide whether it wants to use flush all or not.
>> @@ -1747,7 +1748,12 @@ static bool __purge_vmap_area_lazy(unsig
>>  		list_last_entry(&local_purge_list,
>>  			struct vmap_area, list)->va_end);
>>  
>> -	flush_tlb_kernel_range(start, end);
>> +	if (tmp.va_end > tmp.va_start)
>> +		list_add(&tmp.list, &local_purge_list);
>> +	flush_tlb_kernel_vas(&local_purge_list);
>> +	if (tmp.va_end > tmp.va_start)
>> +		list_del(&tmp.list);
>
> So basically we end up iterating over each VA range, which seems
> sensible if the range is large and we have to iterate over it page
> by page.

Right.

> In the case you have, are "start" and "end" set on function entry
> to a range, or are they set to ULONG_MAX,0 ? What I'm wondering is
> whether we could get away with just having flush_tlb_kernel_vas().
>
> Whether that's acceptable to others is a different question :)

As I said flush_tlb_kernel_vas() should be

void flush_tlb_kernel_vas(struct list_head *list, unsigned int num_entries):

So that an architecture can decide whether it's worth walking the
entries or whether it resorts to a flush all.
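
A minimal userspace sketch of that decision, for illustration only. It
is not the kernel implementation: the entries are passed as a plain
array rather than a struct list_head, and struct va_range,
MAX_WALK_ENTRIES, flush_one_page(), flush_everything() and
flush_kernel_vas_sketch() are all hypothetical names. The threshold of
8 is an arbitrary assumption; a real architecture would pick its own.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE        4096UL
/* Hypothetical cut-off: above this many entries, give up and flush all. */
#define MAX_WALK_ENTRIES 8U

/* Simplified stand-in for the start/end fields of struct vmap_area. */
struct va_range {
	unsigned long va_start;
	unsigned long va_end;
};

/* Counters stand in for the real TLB maintenance operations. */
static unsigned long pages_flushed;
static unsigned long full_flushes;

static void flush_one_page(unsigned long addr)
{
	(void)addr;
	pages_flushed++;
}

static void flush_everything(void)
{
	full_flushes++;
}

/*
 * Walk the entries page by page when there are few of them,
 * otherwise fall back to a single full flush.
 */
static void flush_kernel_vas_sketch(const struct va_range *vas,
				    unsigned int num_entries)
{
	unsigned long addr;
	unsigned int i;

	if (num_entries > MAX_WALK_ENTRIES) {
		flush_everything();
		return;
	}

	for (i = 0; i < num_entries; i++) {
		assert(vas[i].va_end >= vas[i].va_start);
		for (addr = vas[i].va_start; addr < vas[i].va_end;
		     addr += PAGE_SIZE)
			flush_one_page(addr);
	}
}
```

With three small entries the sketch walks them page by page; with more
than MAX_WALK_ENTRIES entries it does one full flush instead.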

>> +static void do_flush_vas(void *arg)
>> +{
>> +	struct list_head *list = arg;
>> +	struct vmap_area *va;
>> +	unsigned long addr;
>> +
>> +	list_for_each_entry(va, list, list) {
>> +		/* flush range by one by one 'invlpg' */
>> +		for (addr = va->va_start; addr < va->va_end; addr += PAGE_SIZE)
>> +			flush_tlb_one_kernel(addr);
>
> Isn't this just the same as:
> 	flush_tlb_kernel_range(va->va_start, va->va_end);

Indeed.
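
With that substitution the callback would presumably reduce to one
range call per entry. A userspace sketch of the shape, assuming a
simple singly linked list in place of struct list_head;
struct va_node, flush_range_stub() and do_flush_vas_sketch() are
hypothetical stand-ins, not kernel interfaces:

```c
#include <stddef.h>

/* Simplified stand-in for a struct vmap_area on the purge list. */
struct va_node {
	unsigned long va_start;
	unsigned long va_end;
	struct va_node *next;
};

static unsigned long range_calls;
static unsigned long bytes_covered;

/* Stand-in for flush_tlb_kernel_range(); only records what was asked. */
static void flush_range_stub(unsigned long start, unsigned long end)
{
	range_calls++;
	bytes_covered += end - start;
}

/* One range flush per entry instead of an open-coded per-page loop. */
static void do_flush_vas_sketch(void *arg)
{
	struct va_node *va;

	for (va = arg; va != NULL; va = va->next)
		flush_range_stub(va->va_start, va->va_end);
}
```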

Thanks,

        tglx


