[External] [PATCH v5 08/13] riscv: Avoid TLB flush loops when affected by SiFive CIP-1200

Samuel Holland samuel.holland at sifive.com
Mon Mar 11 17:35:49 PDT 2024


Hi Yunhui,

On 2024-02-29 8:48 PM, yunhui cui wrote:
> Hi Samuel,
> 
> On Fri, Mar 1, 2024 at 7:22 AM Samuel Holland <samuel.holland at sifive.com> wrote:
>>
>> Since implementations affected by SiFive errata CIP-1200 always use the
>> global variant of the sfence.vma instruction, they only need to execute
>> the instruction once. The range-based loop only hurts performance.
>>
>> Signed-off-by: Samuel Holland <samuel.holland at sifive.com>
>> ---
>>
>> (no changes since v4)
>>
>> Changes in v4:
>>  - Only set tlb_flush_all_threshold when CONFIG_MMU=y.
>>
>> Changes in v3:
>>  - New patch for v3
>>
>>  arch/riscv/errata/sifive/errata.c | 5 +++++
>>  arch/riscv/include/asm/tlbflush.h | 2 ++
>>  arch/riscv/mm/tlbflush.c          | 2 +-
>>  3 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/errata/sifive/errata.c b/arch/riscv/errata/sifive/errata.c
>> index 3d9a32d791f7..716cfedad3a2 100644
>> --- a/arch/riscv/errata/sifive/errata.c
>> +++ b/arch/riscv/errata/sifive/errata.c
>> @@ -42,6 +42,11 @@ static bool errata_cip_1200_check_func(unsigned long  arch_id, unsigned long imp
>>                 return false;
>>         if ((impid & 0xffffff) > 0x200630 || impid == 0x1200626)
>>                 return false;
>> +
>> +#ifdef CONFIG_MMU
>> +       tlb_flush_all_threshold = 0;
>> +#endif
>> +
>>         return true;
>>  }
>>
>> diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
>> index 463b615d7728..8e329721375b 100644
>> --- a/arch/riscv/include/asm/tlbflush.h
>> +++ b/arch/riscv/include/asm/tlbflush.h
>> @@ -66,6 +66,8 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
>>                                unsigned long uaddr);
>>  void arch_flush_tlb_batched_pending(struct mm_struct *mm);
>>  void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
>> +
>> +extern unsigned long tlb_flush_all_threshold;
>>  #else /* CONFIG_MMU */
>>  #define local_flush_tlb_all()                  do { } while (0)
>>  #endif /* CONFIG_MMU */
>> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
>> index 365e0a0e4725..22870f213188 100644
>> --- a/arch/riscv/mm/tlbflush.c
>> +++ b/arch/riscv/mm/tlbflush.c
>> @@ -11,7 +11,7 @@
>>   * Flush entire TLB if number of entries to be flushed is greater
>>   * than the threshold below.
>>   */
>> -static unsigned long tlb_flush_all_threshold __read_mostly = 64;
>> +unsigned long tlb_flush_all_threshold __read_mostly = 64;
>>
>>  static void local_flush_tlb_range_threshold_asid(unsigned long start,
>>                                                  unsigned long size,
>> --
>> 2.43.1
>>
> 
> If local_flush_tlb_all_asid() is used every time, more PTWs will be
> generated. Will such modifications definitely improve the overall
> performance?

This change in this commit specifically applies to older SiFive SoCs with a bug
making single-page sfence.vma instructions unsafe to use. In this case, a single
call to local_flush_tlb_all_asid() is optimal, yes.

> Hi Alex, Samuel,
> The relationship between flush_xx_range_asid() and nr_ptes is
> basically linear growth (y=kx +b), while flush_all_asid() has nothing
> to do with nr_ptes (y=c).
> Some TLBs may do some optimization. The operation of flush all itself
> requires very few cycles, but there is a certain delay between
> consecutive flush all.
> The intersection of the two straight lines is the optimal solution of
> tlb_flush_all_threshold. In actual situations, continuous
> flush_all_asid will not occur. One problem caused by flush_all_asid()
> is that multiple flush entries require PTW, which causes greater
> latency.
> Therefore, the value of tlb_flush_all_threshold needs to be considered
> or quantified. Maybe doing local_flush_tlb_page_asid() based on the
> actual nr_ptes_in_range would give better overall performance.
> What do you think?

Yes, this was something Alex brought up when adding this threshold, that it
should be tuned for various scenarios. That still needs to be done. This patch
just covers one specific case where we know the optimal answer due to an erratum.

Regards,
Samuel




More information about the linux-riscv mailing list