[PATCH] arm64: mm: drop tlb flush operation when clearing the access bit

Baolin Wang baolin.wang at linux.alibaba.com
Tue Oct 24 18:44:34 PDT 2023



On 10/24/2023 9:48 PM, Kefeng Wang wrote:
> 
> 
> On 2023/10/24 20:56, Baolin Wang wrote:
>> Currently ptep_clear_flush_young() is only called by folio_referenced()
>> to check whether the folio was referenced, and on the ARM64 architecture
>> it issues a TLB flush. However, the TLB flush can be expensive on ARM64
>> servers, especially on systems with a large number of CPUs.
>>
>> The comments below, taken from the x86 architecture, apply equally to
>> the ARM64 architecture. So we can drop the TLB flush operation in
>> ptep_clear_flush_young() on ARM64 to improve performance.
>> "
>> /* Clearing the accessed bit without a TLB flush
>>   * doesn't cause data corruption. [ It could cause incorrect
>>   * page aging and the (mistaken) reclaim of hot pages, but the
>>   * chance of that should be relatively low. ]
>>   *
>>   * So as a performance optimization don't flush the TLB when
>>   * clearing the accessed bit, it will eventually be flushed by
>>   * a context switch or a VM operation anyway. [ In the rare
>>   * event of it not getting flushed for a long time the delay
>>   * shouldn't really matter because there's no real memory
>>   * pressure for swapout to react to. ]
>>   */
>> "
>> Running thpscale shows clear improvements in compaction latency with
>> this patch:
>>                               base                   patched
>> Amean     fault-both-1      1093.19 (   0.00%)     1084.57 *   0.79%*
>> Amean     fault-both-3      2566.22 (   0.00%)     2228.45 *  13.16%*
>> Amean     fault-both-5      3591.22 (   0.00%)     3146.73 *  12.38%*
>> Amean     fault-both-7      4157.26 (   0.00%)     4113.67 *   1.05%*
>> Amean     fault-both-12     6184.79 (   0.00%)     5218.70 *  15.62%*
>> Amean     fault-both-18     9103.70 (   0.00%)     7739.71 *  14.98%*
>> Amean     fault-both-24    12341.73 (   0.00%)    10684.23 *  13.43%*
>> Amean     fault-both-30    15519.00 (   0.00%)    13695.14 *  11.75%*
>> Amean     fault-both-32    16189.15 (   0.00%)    14365.73 *  11.26%*
>>                         base       patched
>> Duration User         167.78      161.03
>> Duration System      1836.66     1673.01
>> Duration Elapsed     2074.58     2059.75
>>
>> Barry Song submitted a similar patch [1] before, that replaces the
>> ptep_clear_flush_young_notify() with ptep_clear_young_notify() in
>> folio_referenced_one(). However, I'm not sure whether removing the TLB
>> flush operation is applicable to every architecture in the kernel, so
>> dropping the TLB flush only for ARM64 seems a sensible change.
> 
> At least x86/s390/riscv/powerpc already do it, also I think we could

Right.

> change pmdp_clear_flush_young_notify() too, since it is the same as
> ptep_clear_flush_young_notify().

Perhaps yes, but I'm still unsure whether removing the TLB flush for PMD
entries is applicable to all architectures. Let's see how the discussion
in this thread goes. Thanks.
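
To make the difference concrete, here is a minimal user-space sketch of the
two behaviours being discussed. It is only a toy model, not the kernel code:
struct toy_mm, TOY_PTE_YOUNG and the toy_* helpers are hypothetical names
invented for illustration, and the "flush" is just a counter. It assumes the
only observable difference is whether an invalidation is issued after the
accessed bit is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessed ("young") bit for this toy model only. */
#define TOY_PTE_YOUNG (UINT64_C(1) << 10)

/* Toy state: one simulated PTE plus a count of simulated TLB flushes. */
struct toy_mm {
	uint64_t pte;
	unsigned long flushes;
};

/* Clear the accessed bit and report whether it was set beforehand. */
static bool toy_test_and_clear_young(struct toy_mm *mm)
{
	assert(mm != NULL);

	bool young = (mm->pte & TOY_PTE_YOUNG) != 0;

	mm->pte &= ~TOY_PTE_YOUNG;
	return young;
}

/* Model of the current behaviour: flush whenever the entry was young. */
static bool toy_clear_flush_young(struct toy_mm *mm)
{
	bool young = toy_test_and_clear_young(mm);

	if (young)
		mm->flushes++;	/* stands in for a TLB invalidation */
	return young;
}

/* Model of the proposed behaviour: clear the bit, never flush. */
static bool toy_clear_young_noflush(struct toy_mm *mm)
{
	return toy_test_and_clear_young(mm);
}
```

Both helpers return the same "was it young" answer and leave the same PTE
value behind; only the flush counter differs. The cost of a stale TLB entry
(possible mis-aging of a hot page, as the quoted x86 comment describes) is
not modelled here.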


