[PATCH v2 1/2] iopoll: Call cpu_relax() in busy loops

Tony Lindgren tony at atomide.com
Wed May 10 23:48:39 PDT 2023


* Geert Uytterhoeven <geert+renesas at glider.be> [230510 13:23]:
> It is considered good practice to call cpu_relax() in busy loops, see
> Documentation/process/volatile-considered-harmful.rst.  This can not
> only lower CPU power consumption or yield to a hyperthreaded twin
> processor, but also allows an architecture to mitigate hardware issues
> (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the
> architecture-specific cpu_relax() implementation.
> 
> In addition, cpu_relax() is also a compiler barrier.  It is not
> immediately obvious that the @op argument "function" will result in an
> actual function call (e.g. in case of inlining).
> 
> Where a function call is a C sequence point, this is lost on inlining.
> Therefore, with aggressive enough optimization it might be possible for
> the compiler to hoist the:
> 
>         (val) = op(args);
> 
> "load" out of the loop because it doesn't see the value changing. The
> addition of cpu_relax() would inhibit this.
> 
> As the iopoll helpers lack calls to cpu_relax(), people are sometimes
> reluctant to use them, and may fall back to open-coded polling loops
> (including cpu_relax() calls) instead.
> 
> Fix this by adding calls to cpu_relax() to the iopoll helpers:
>   - For the non-atomic case, it is sufficient to call cpu_relax() in
>     case of a zero sleep-between-reads value, as a call to
>     usleep_range() is a safe barrier otherwise.  However, it doesn't
>     hurt to add the call regardless, for simplicity, and for similarity
>     with the atomic case below.
>   - For the atomic case, cpu_relax() must be called regardless of the
>     sleep-between-reads value, as there is no guarantee all
>     architecture-specific implementations of udelay() handle this.
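
For illustration, the shape of such a polling loop can be sketched in plain
userspace C. This is only a sketch under stated assumptions, not the iopoll
code itself: cpu_relax_sketch(), struct fake_dev, fake_read_status() and
poll_ready_sketch() are hypothetical names, the real helpers use a time-based
timeout rather than a retry count, and the stand-in for cpu_relax() is reduced
to a GCC/Clang-style compiler barrier.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for cpu_relax().  Here it is only a
 * compiler barrier: the "memory" clobber forces the compiler to assume
 * memory may have changed, so a load cannot be hoisted across it.
 * Real architectures may do more in their cpu_relax().
 */
static inline void cpu_relax_sketch(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/* Hypothetical device whose status bit 0 becomes set after a few reads. */
struct fake_dev {
	uint32_t status;
	unsigned int reads_until_ready;
};

/* Plays the role of the "op" argument: one read of the status value. */
static uint32_t fake_read_status(struct fake_dev *dev)
{
	if (dev->reads_until_ready > 0)
		dev->reads_until_ready--;
	if (dev->reads_until_ready == 0)
		dev->status |= 0x1;
	return dev->status;
}

/*
 * Simplified poll loop (assumption: bounded by a retry count instead of
 * a timeout).  Returns 0 once bit 0 is seen, -1 if max_tries run out.
 */
static int poll_ready_sketch(struct fake_dev *dev, unsigned int max_tries,
			     uint32_t *val)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		*val = fake_read_status(dev);
		if (*val & 0x1)
			return 0;
		/* Relax between reads; also keeps the read inside the loop. */
		cpu_relax_sketch();
	}
	return -1;
}
```

In this sketch the barrier sits between consecutive reads, so even if
fake_read_status() were inlined, the compiler would have to reload the
status on every iteration.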

Reviewed-by: Tony Lindgren <tony at atomide.com>



More information about the linux-arm-kernel mailing list