Excessive TLB flush ranges
Thomas Gleixner
tglx at linutronix.de
Wed May 17 04:58:44 PDT 2023
On Wed, May 17 2023 at 13:26, Uladzislau Rezki wrote:
> On Tue, May 16, 2023 at 07:04:34PM +0200, Thomas Gleixner wrote:
>> The proposed flush_tlb_kernel_vas(list, num_pages) mechanism
>> achieves:
>>
>> 1) It batches multiple ranges to _one_ invocation
>>
>> 2) It lets the architecture decide based on the number of pages
>> whether it does a tlb_flush_all() or a flush of individual ranges.
>>
>> Whether the architecture uses IPIs or flushes only locally and the
>> hardware propagates that is completely irrelevant.
>>
>> Right now any coalesced range, which is huge due to massive holes, takes
>> decision #2 away.
>>
>> If you want to flush individual VAs from the core vmalloc code then you
>> lose #1, as the aggregated number of pages might justify a tlb_flush_all().
>>
>> That's a pure architecture decision and all the core code needs to do is
>> to provide appropriate information and not some completely bogus request
>> to flush 17312759359 pages, i.e. a ~64.5 TB range, while in reality
>> there are exactly _three_ distinct pages to flush.
>>
> 1.
>
> I think, all two cases(logic) should be moved into ARCH code, so a decision
> is made _not_ by vmalloc code how to flush, either fully, if it supported or
> page by page that require list chasing.
Which is exactly what my patch does, no?
> 2.
> void vfree(const void *addr)
> {
> ...
> if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
> vm_reset_perms(vm); <----
> ...
> }
>
> so, all purged areas are drained in a caller context, so it is blocked
> until the drain is done including flushing. I am not sure why it is done
> from a caller context.
>
> IMHO, it should be deferred same way as we do in:
>
> static void free_vmap_area_noflush(struct vmap_area *va)
How is that avoiding the problem? It just deferres it to some point in
the future. There is no guarantee that batching will be large enough to
justify a full flush.
> if do not miss the point why vfree() has to do it directly.
Keeping executable mappings around until some other flush happens is
obviously neither a brilliant idea nor correct.
Thanks
tglx
More information about the linux-arm-kernel
mailing list