rcu/tree: Protect rcu_rdp_is_offloaded() invocations on RT

Frederic Weisbecker frederic at kernel.org
Tue Sep 21 16:45:18 PDT 2021


On Tue, Sep 21, 2021 at 11:12:50PM +0200, Thomas Gleixner wrote:
> Valentin reported warnings about suspicious RCU usage on RT kernels. Those
> happen when offloading of RCU callbacks is enabled:
> 
>   WARNING: suspicious RCU usage
>   5.13.0-rt1 #20 Not tainted
>   -----------------------------
>   kernel/rcu/tree_plugin.h:69 Unsafe read of RCU_NOCB offloaded state!
> 
>   rcu_rdp_is_offloaded (kernel/rcu/tree_plugin.h:69 kernel/rcu/tree_plugin.h:58)
>   rcu_core (kernel/rcu/tree.c:2332 kernel/rcu/tree.c:2398 kernel/rcu/tree.c:2777)
>   rcu_cpu_kthread (./include/linux/bottom_half.h:32 kernel/rcu/tree.c:2876)
> 
> The reason is that rcu_rdp_is_offloaded() is invoked without one of the
> required protections on RT enabled kernels because local_bh_disable() does
> not disable preemption on RT.
> 
> Valentin proposed to add a local lock to the code in question, but that's
> suboptimal in several aspects:
> 
>   1) local locks add extra code to !RT kernels for no value.
> 
>   2) All possible callsites have to be audited and amended when affected,
>      possibly at an outer function level due to lock nesting issues.
> 
>   3) As the local lock has to be taken at the outer functions, it has to be
>      released and reacquired in the inner code sections which might
>      voluntarily schedule, e.g. rcu_do_batch().
> 
> Both callsites of rcu_rdp_is_offloaded() which trigger this check invoke
> rcu_rdp_is_offloaded() in the variable declaration section right at the top
> of the functions. But the actual usage of the result is either within a
> section which provides the required protections or after such a section.
> 
> So the obvious solution is to move the invocation into the code sections
> which provide the proper protections, which solves the problem for RT and
> does not have any impact on !RT kernels.
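
For illustration only, the reordering described above can be modelled in
userspace roughly as follows. This is a minimal sketch, not the actual patch:
protect_enter(), protect_exit(), model_is_offloaded(), core_before() and
core_after() are hypothetical stand-ins, and the kernel's real check is
replaced here by a simple counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static int protect_depth;            /* > 0 models "a required protection is held" */
static int unsafe_reads;             /* reads that would have triggered the warning */
static bool offloaded_state = true;  /* models the per-CPU offloaded state */

static void protect_enter(void)
{
	protect_depth++;
}

static void protect_exit(void)
{
	assert(protect_depth > 0);
	protect_depth--;
}

/* Models rcu_rdp_is_offloaded(): the read is only safe under protection. */
static bool model_is_offloaded(void)
{
	if (protect_depth == 0)
		unsafe_reads++;
	return offloaded_state;
}

/* Before: the state is read in the declaration section, unprotected. */
static bool core_before(void)
{
	bool offloaded = model_is_offloaded();
	bool result;

	protect_enter();
	result = offloaded;	/* the value is only used in here */
	protect_exit();
	return result;
}

/* After: the read is moved into the section which provides the protection. */
static bool core_after(void)
{
	bool result;

	protect_enter();
	result = model_is_offloaded();
	protect_exit();
	return result;
}
```

In this model core_before() bumps unsafe_reads once per call while
core_after() never does, and both return the same value, which mirrors the
claim that the move fixes RT without changing behaviour on !RT.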

Also while at it, I'm asking again: traditionally softirqs could assume that
manipulating a local state was safe against !irq_count() code fiddling with
the same state on the same CPU.

Now with preemptible softirqs, that assumption can be broken anytime. RCU was
fortunate enough to have a warning for that. But who knows how many issues like
this are lurking?

Thanks.


