[PATCH v5 4/4] ARM: Add support for Hisilicon Kunpeng L3 cache controller

Arnd Bergmann arnd at kernel.org
Fri Jan 29 03:16:03 EST 2021


On Fri, Jan 29, 2021 at 8:23 AM Leizhen (ThunderTown)
<thunder.leizhen at huawei.com> wrote:
> On 2021/1/28 22:24, Arnd Bergmann wrote:
> > On Sat, Jan 16, 2021 at 4:27 AM Zhen Lei <thunder.leizhen at huawei.com> wrote:
> >> diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
> >> +
> >> +static void l3cache_maint_common(u32 range, u32 op_type)
> >> +{
> >> +       u32 reg;
> >> +
> >> +       reg = readl(l3_ctrl_base + L3_MAINT_CTRL);
> >> +       reg &= ~(L3_MAINT_RANGE_MASK | L3_MAINT_TYPE_MASK);
> >> +       reg |= range | op_type;
> >> +       reg |= L3_MAINT_STATUS_START;
> >> +       writel(reg, l3_ctrl_base + L3_MAINT_CTRL);
> >
> > Are there contents of L3_MAINT_CTRL that need to be preserved
> > across calls and can not be inferred? A 'readl()' is often expensive,
> > so it might be more efficient if you can avoid that.
>
> Right, this readl() can be replaced with readl_relaxed(). Thanks.
>
> I'll check and correct the readl() and writel() in other places.

What I meant is that if you want to replace them, you should provide
performance numbers that show how much difference this makes
and add comments in the source code explaining how you proved that
the _relaxed() version is actually correct.

> >> +static inline void l3cache_flush_all_nolock(void)
> >> +{
> >> +       l3cache_maint_common(L3_MAINT_RANGE_ALL, L3_MAINT_TYPE_FLUSH);
> >> +}
> >> +
> >> +static void l3cache_flush_all(void)
> >> +{
> >> +       unsigned long flags;
> >> +
> >> +       spin_lock_irqsave(&l3cache_lock, flags);
> >> +       l3cache_flush_all_nolock();
> >> +       spin_unlock_irqrestore(&l3cache_lock, flags);
> >> +}
> >
> > I see that cache-l2x0 uses raw_spin_lock_irqsave() instead of
> > spin_lock_irqsave(), to avoid preemption in the middle of a cache
> > operation. This is probably a good idea here as well.
>
> I don't think there's any essential difference between the two! I don't know
> if the compiler or tool will do anything extra. I checked the git log of the
> l2x0 driver and it used raw_spin_lock_irqsave() at the beginning. Maybe
> there's a description in 2.6. Since you mentioned this potential risk, I'll
> change it to raw_spin_lock_irqsave.

> include/linux/spinlock.h:
> static __always_inline raw_spinlock_t *spinlock_check(spinlock_t *lock)
> {
>         return &lock->rlock;
> }
>
> #define spin_lock_irqsave(lock, flags)                          \
> do {                                                            \
>         raw_spin_lock_irqsave(spinlock_check(lock), flags);     \
> } while (0)

The spin_lock_irqsave() definition is one of the things that differs
with CONFIG_PREEMPT_RT=y, where it uses a mutex instead.

See https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/log/?h=linux-5.11.y-rt-rebase
for the final missing patches including the one that changes the
spinlock definition and some patches that change a few other spin_lock
to raw_spin_lock.

> >> +static int __init l3cache_init(void)
> >> +{
> >> +       u32 reg;
> >> +       struct device_node *node;
> >> +
> >> +       node = of_find_matching_node(NULL, l3cache_ids);
> >> +       if (!node)
> >> +               return -ENODEV;
> >
> > I think the initcall should return '0' to indicate success when running
> > a kernel with this driver built-in on a platform that does not have
> > this device.
>
> I have added "depends on ARCH_KUNPENG50X" for this driver. But it's OK to
> return 0.

Note that the "depends on ARCH_KUNPENG50X" is not relevant here, since
it only prevents you from enabling the driver on kernels that explicitly exclude
the kunpeng platform, but it has no significance to what you are actually
running on. The "multi_v7_defconfig" file should have all actively maintained
armv7 platforms enabled, similar to how common distros ship their kernels.

        Arnd



More information about the linux-arm-kernel mailing list