[PATCH v3 2/2] RISC-V: Clean up the Zicbom block size probing
Andrew Jones
ajones at ventanamicro.com
Thu Sep 8 04:22:10 PDT 2022
On Thu, Sep 08, 2022 at 11:48:31AM +0100, Jessica Clarke wrote:
> On 8 Sept 2022, at 09:10, Heiko Stübner <heiko at sntech.de> wrote:
> >
> > Am Donnerstag, 8. September 2022, 09:11:57 CEST schrieb Andrew Jones:
> >> On Wed, Sep 07, 2022 at 03:47:09PM -0700, Atish Patra wrote:
> >>> On Tue, Sep 6, 2022 at 12:45 AM Andrew Jones <ajones at ventanamicro.com>
> >>> wrote:
> >>>
> >>>> From: Palmer Dabbelt <palmer at rivosinc.com>
> >>>>
> >>>> This fixes two issues: I truncated the warning's hart ID when porting to
> >>>> the 64-bit hart ID code, and the original code's warning handling could
> >>>> fire on an uninitialized hart ID.
> >>>>
> >>>> The biggest change here is that riscv_cbom_block_size is no longer
> >>>> initialized, as IMO the default isn't sane: there's nothing in the ISA
> >>>> that mandates any specific cache block size, so falling back to one will
> >>>> just silently produce the wrong answer on some systems. This also
> >>>> changes the probing order so the cache block size is known before
> >>>> enabling Zicbom support.
> >>>>
> >>>> Fixes: 3aefb2ee5bdd ("riscv: implement Zicbom-based CMO instructions + the
> >>>> t-head variant")
> >>>> Fixes: 1631ba1259d6 ("riscv: Add support for non-coherent devices using
> >>>> zicbom extension")
> >>>> Reported-by: kernel test robot <lkp at intel.com>
> >>>> Signed-off-by: Palmer Dabbelt <palmer at rivosinc.com>
> >>>> Reviewed-by: Conor Dooley <conor.dooley at microchip.com>
> >>>> [Rebased on Anup's move patch and applied Conor Dooley's and Heiko
> >>>> Stuebner's changes.]
> >>>> Signed-off-by: Andrew Jones <ajones at ventanamicro.com>
> >>>> ---
> >>>> arch/riscv/errata/thead/errata.c | 1 +
> >>>> arch/riscv/kernel/setup.c | 2 +-
> >>>> arch/riscv/mm/cacheflush.c | 21 +++++++++++----------
> >>>> arch/riscv/mm/dma-noncoherent.c | 2 ++
> >>>> 4 files changed, 15 insertions(+), 11 deletions(-)
> >>>>
> >>>> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> >>>> index 202c83f677b2..96648c176f37 100644
> >>>> --- a/arch/riscv/errata/thead/errata.c
> >>>> +++ b/arch/riscv/errata/thead/errata.c
> >>>> @@ -37,6 +37,7 @@ static bool errata_probe_cmo(unsigned int stage,
> >>>> if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> >>>> return false;
> >>>>
> >>>> + riscv_cbom_block_size = L1_CACHE_BYTES;
> >>>> riscv_noncoherent_supported();
> >>>> return true;
> >>>> #else
> >>>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> >>>> index 95ef6e2bf45c..2dfc463b86bb 100644
> >>>> --- a/arch/riscv/kernel/setup.c
> >>>> +++ b/arch/riscv/kernel/setup.c
> >>>> @@ -296,8 +296,8 @@ void __init setup_arch(char **cmdline_p)
> >>>> setup_smp();
> >>>> #endif
> >>>>
> >>>> - riscv_fill_hwcap();
> >>>> riscv_init_cbom_blocksize();
> >>>> + riscv_fill_hwcap();
> >>>> apply_boot_alternatives();
> >>>> }
> >>>>
> >>>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> >>>> index 336c5deea870..e5b087be1577 100644
> >>>> --- a/arch/riscv/mm/cacheflush.c
> >>>> +++ b/arch/riscv/mm/cacheflush.c
> >>>> @@ -89,39 +89,40 @@ void flush_icache_pte(pte_t pte)
> >>>> }
> >>>> #endif /* CONFIG_MMU */
> >>>>
> >>>> -unsigned int riscv_cbom_block_size = L1_CACHE_BYTES;
> >>>> +unsigned int riscv_cbom_block_size;
> >>>>
> >>>> #ifdef CONFIG_RISCV_ISA_ZICBOM
> >>>> void riscv_init_cbom_blocksize(void)
> >>>> {
> >>>> struct device_node *node;
> >>>> + unsigned long cbom_hartid;
> >>>> + u32 val, probed_block_size;
> >>>> int ret;
> >>>> - u32 val;
> >>>>
> >>>> + probed_block_size = 0;
> >>>> for_each_of_cpu_node(node) {
> >>>> unsigned long hartid;
> >>>> - int cbom_hartid;
> >>>>
> >>>> ret = riscv_of_processor_hartid(node, &hartid);
> >>>> if (ret)
> >>>> continue;
> >>>>
> >>>> - if (hartid < 0)
> >>>> - continue;
> >>>> -
> >>>> /* set block-size for cbom extension if available */
> >>>> ret = of_property_read_u32(node, "riscv,cbom-block-size",
> >>>> &val);
> >>>> if (ret)
> >>>> continue;
> >>>>
> >>>> - if (!riscv_cbom_block_size) {
> >>>> - riscv_cbom_block_size = val;
> >>>> + if (!probed_block_size) {
> >>>> + probed_block_size = val;
> >>>> cbom_hartid = hartid;
> >>>> } else {
> >>>> - if (riscv_cbom_block_size != val)
> >>>> - pr_warn("cbom-block-size mismatched between harts %d and %lu\n",
> >>>> + if (probed_block_size != val)
> >>>> + pr_warn("cbom-block-size mismatched between harts %lu and %lu\n",
> >>>> cbom_hartid, hartid);
> >>>>
> >>>
> >>> Maybe add more info saying the first one will be selected in that case as
> >>> it is just a warning.
> >>
> >> If we detect a mismatch then should we disable the CMO extension?
> >
> > From a user's pov I'd think their system might stop working with disabled
> > cmo - for things like networking / mass storage or so.
> >
> > Also the amount of misbehaviour might depend on whether the value is
> > shrinking or expanding.
> >
> > Going from block_size x -> x/2 will "just" result in some areas being
> > handled twice, whereas going from x -> 2x will leave out some areas,
> > since the cpu itself still just does "x".
> >
> >
> > An experiment on the D1 supports that thought ;-)
> > with L1_CACHE_BYTES / 2, networking keeps working
> > with L1_CACHE_BYTES * 2 (plus adapting MINALIGN), networking breaks.
> >
> >
> > So I'd think we should loudly warn about misconfiguration anyway,
> > but could just use the smallest value as block_size (in a future patch)
> > to keep as many systems as possible running in such a case.
>
> You need to use the smallest size for your stride in Zicbom instruction
> loops, but the largest size as your alignment and padding granularity
> for allocations, otherwise you’ll have cache line aliasing on some of
> the cores and have correctness issues (invalidating things outside your
> allocation is the obvious problem, but other allocations pulling in
> cache lines you’ve just flushed is also a problem).
>
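To make the stride versus alignment distinction concrete, here is a
minimal userspace sketch (not kernel code). The helper names
cmo_stride(), cmo_align() and uncovered_bytes() are made up for
illustration, and it assumes all block sizes are powers of two.
uncovered_bytes() models a hart whose real block size is hw_block while
the loop issues one operation every stride bytes, and counts the bytes
of the range that no operation touches.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest per-hart block size, used as the loop stride. */
static size_t cmo_stride(const size_t *sizes, size_t n)
{
	size_t min = sizes[0];

	for (size_t i = 1; i < n; i++)
		if (sizes[i] < min)
			min = sizes[i];
	return min;
}

/* Hypothetical helper: largest per-hart block size, used for alignment/padding. */
static size_t cmo_align(const size_t *sizes, size_t n)
{
	size_t max = sizes[0];

	for (size_t i = 1; i < n; i++)
		if (sizes[i] > max)
			max = sizes[i];
	return max;
}

/*
 * Model only: one operation is issued every 'stride' bytes, starting at
 * 'start' rounded down to 'stride'. Each operation affects the whole
 * 'hw_block'-sized block containing its address. Returns how many bytes
 * of [start, start + len) are never affected.
 */
static size_t uncovered_bytes(uintptr_t start, size_t len,
			      size_t stride, size_t hw_block)
{
	uintptr_t end = start + len;
	uintptr_t first = start & ~(uintptr_t)(stride - 1);
	size_t missed = 0;

	for (uintptr_t b = start; b < end; b++) {
		int hit = 0;

		for (uintptr_t a = first; a < end; a += stride) {
			if (a / hw_block == b / hw_block) {
				hit = 1;
				break;
			}
		}
		if (!hit)
			missed++;
	}
	return missed;
}
```

With 64-byte hardware blocks, a 32-byte stride leaves nothing uncovered
over a 256-byte range (each block is just handled twice), while a
128-byte stride skips every other block.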
This sounds like a good argument to me to just BUG on a DT with
mismatches, forcing the DT to get fixed.
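Roughly, the probing would then reject a mismatch instead of warning.
The following is only a userspace model of that idea, not the patch:
struct hart_info and probe_cbom_block_size() are hypothetical names, a
block size of 0 stands in for a missing "riscv,cbom-block-size"
property, and the error return stands in for whatever the kernel would
actually do (e.g. BUG).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for what would be read from each DT cpu node. */
struct hart_info {
	unsigned long hartid;
	uint32_t cbom_block_size;	/* 0: property absent */
};

/*
 * Returns 0 and stores the common block size in *out (0 if no hart
 * reports one). Returns -1 on the first mismatch and leaves *out
 * untouched.
 */
static int probe_cbom_block_size(const struct hart_info *harts, size_t n,
				 uint32_t *out)
{
	uint32_t probed = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t val = harts[i].cbom_block_size;

		if (!val)
			continue;	/* hart does not report a block size */

		if (!probed)
			probed = val;	/* first hart with the property */
		else if (probed != val)
			return -1;	/* mismatch: refuse to pick one */
	}

	*out = probed;
	return 0;
}
```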
Thanks,
drew