Tickless timer regression in v5.18.3 on riscv
Anup Patel
anup at brainfault.org
Thu Jun 23 20:59:40 PDT 2022
+Atish (for Sstc patches), +Sunil (for ACPI support)
On Fri, Jun 24, 2022 at 1:21 AM Samuel Holland <samuel at sholland.org> wrote:
>
> On 6/23/22 1:11 PM, Sergey Larin wrote:
> > Hello,
> >
> > It appears that a tickless timer regression was introduced in commit
> > 232ccac1bd9b5bfe73895f527c08623e7fa0752d ("clocksource/drivers/riscv: Events are
> > stopped during CPU suspend").
> >
> > Even though this is a fix for CPU suspend, it causes high idle CPU usage.
> > However, this usage is not reported to userspace, but can be observed in virtual
> > machines like QEMU or RVVM. So the kernel is no longer tickless; disabling
> > NO_HZ has the same effect.
>
> The SBI timer extension is not a great clockevent source to start with. It
> requires calling in to M-mode/the hypervisor, both for setting the timer and
> (without Sstc) a trap for handling the interrupt. That is why its rating is only
> 100.
When Sstc is available, the clockevent device rating should be increased (400+).
(@Atish, can you update this in your Linux Sstc series?)
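As a rough illustration of the rating change, here is a minimal userspace sketch. It is not the actual timer-riscv code: the enum names and the helper are hypothetical, 100 is the current rating mentioned above, and 400 is only the suggested lower bound.

```c
#include <stdbool.h>

/*
 * Hypothetical rating values, for illustration only:
 * 100 is the rating of the SBI-backed clockevent discussed in this thread,
 * 400 is the suggested minimum when Sstc is available.
 */
enum {
    RISCV_TIMER_RATING_SBI  = 100,
    RISCV_TIMER_RATING_SSTC = 400,
};

/* Pick the clockevent rating based on whether the hart implements Sstc. */
static int riscv_timer_rating(bool has_sstc)
{
    return has_sstc ? RISCV_TIMER_RATING_SSTC : RISCV_TIMER_RATING_SBI;
}
```

With a higher rating, the Sstc-backed clockevent would no longer lose out to most other clockevent devices by default.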
>
> Do your virtual machines have no other MMIO timer device? Practically any other
> clockevent device will be used in preference to this one. And if that other
> device does not have the CLOCK_EVT_FEAT_C3STOP flag, then idle CPU usage will
> not be increased.
I agree. RISC-V platforms that don't preserve the mtimer (MMIO) in idle states
should have an additional timer in the "always-on" domain for idle wake-up.
>
> > Any ideas why this is needed since CLINT timers should be active in WFI?
>
> The CLINT timers will be active in WFI, but not necessarily in all CPU idle
> states. For example, the T-HEAD C9xx includes the CLINT inside the CPU's reset
> domain[1]. So a nonretentive idle state that puts the CPU in reset will stop the
> CLINT (in addition to resetting the circuitry responsible for delivering the
> timer interrupt).
>
> Also keep in mind that the SBI time extension is not necessarily backed by a
> CLINT. In fact, the SBI specification lists no real requirements for the timer,
> such as minimum/maximum duration or when the timer is allowed to stop. So the
> kernel has to assume the worst case.
Yes, when Sstc is available, the SBI time extension will internally use stimecmp
for SW compatibility (as recommended by the Sstc specification).
Assuming the worst case and always setting C3STOP in timer-riscv is fine for
now, since very few platforms implement CPU idle states, but this will change
soon.
I suggest we introduce an optional "riscv,timer-always-on" DT property for each
HART. If all HART DT nodes have this property, we don't set the C3STOP flag
in the clockevent device; otherwise we set it.
(Note: our proposed ACPI ECRs will all need this flag in the HART capabilities
table.)
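The intended decision can be sketched roughly as follows. This is a standalone userspace model under stated assumptions, not a patch: struct hart_node and both helpers are hypothetical, the feature macros are local stand-ins for the kernel's clockevent flags, and treating "no HART nodes" as the worst case is my assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the kernel's clockevent feature flags. */
#define CLOCK_EVT_FEAT_ONESHOT 0x000002
#define CLOCK_EVT_FEAT_C3STOP  0x000008

/* Hypothetical model of one HART DT node. */
struct hart_node {
    bool timer_always_on; /* true if "riscv,timer-always-on" is present */
};

/* True only if every HART node carries the property. */
static bool all_harts_timer_always_on(const struct hart_node *harts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!harts[i].timer_always_on)
            return false;
    }
    /* Assumption: with no HART nodes at all, keep the worst case. */
    return n > 0;
}

/* Set C3STOP unless all HARTs declare an always-on timer. */
static unsigned int riscv_clockevent_features(const struct hart_node *harts,
                                              size_t n)
{
    unsigned int features = CLOCK_EVT_FEAT_ONESHOT;

    if (!all_harts_timer_always_on(harts, n))
        features |= CLOCK_EVT_FEAT_C3STOP;

    return features;
}
```

A single HART without the property is enough to keep C3STOP set, so existing DTs would see no change in behavior.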
Regards,
Anup
>
> Regards,
> Samuel
>
> [1]:
> https://github.com/T-head-Semi/openc906/blob/main/C906_RTL_FACTORY/gen_rtl/clint/rtl/clint_func.v#L369
>