[PATCH v1] rcu: Fix and improve RCU read lock checks when !CONFIG_DEBUG_LOCK_ALLOC
Gao Xiang
hsiangkao at linux.alibaba.com
Thu Jul 13 12:00:25 PDT 2023
On 2023/7/14 02:14, Paul E. McKenney wrote:
> On Fri, Jul 14, 2023 at 12:09:27AM +0800, Alan Huang wrote:
...
>>>
>>> From what Sandeep described, the code path is in an RCU reader. My
>>> question is more, why doesn't it use SRCU instead since it clearly
>>> does so if BLK_MQ_F_BLOCKING. What are the tradeoffs? IMHO, a deeper
>>> dive needs to be made into that before concluding that the fix is to
>>> use rcu_read_lock_any_held().
>>
>> Copied from [1]:
>>
>> "Background: Historically erofs would always schedule a kworker for
>> decompression which would incur the scheduling cost regardless of
>> the context. But z_erofs_decompressqueue_endio() may not always
>> be in atomic context and we could actually benefit from doing the
>> decompression in z_erofs_decompressqueue_endio() if we are in
>> thread context, for example when running with dm-verity.
>> This optimization was later added in patch [2] which has shown
>> improvement in performance benchmarks.”
>>
>> I’m not sure if it is a design issue.
What do you mean a design issue, honestly? I feel uneasy to hear this.
>
> I have no official opinion myself, but there are quite a few people
> who firmly believe that any situation like this one (where driver or
> file-system code needs to query the current context to see if blocking
> is OK) constitutes a design flaw. Such people might argue that this
> code path should have a clearly documented context, and that if that
> documentation states that the code might be in atomic context, then the
> driver/fs should assume atomic context. Alternatively, if driver/fs
I don't think a software decoder (for example, decompression) should be
left in the atomic context honestly.
Regardless of the decompression speed of some algorithm theirselves
(considering very slow decompression on very slow devices), it means
that we also don't have a way to vmap or likewise (considering
decompression + handle extra deduplication copies) in the atomic
context, even memory allocation has to be in an atomic way.
Especially now have more cases that decodes in the RCU reader context
apart from softirq contexts?
> needs the context to be non-atomic, the callers should make it so.
>
> See for example in_atomic() and its comment header:
>
> /*
> * Are we running in atomic context? WARNING: this macro cannot
> * always detect atomic context; in particular, it cannot know about
> * held spinlocks in non-preemptible kernels. Thus it should not be
> * used in the general case to determine whether sleeping is possible.
> * Do not use in_atomic() in driver code.
> */
> #define in_atomic() (preempt_count() != 0)
>
> In the immortal words of Dan Frye, this should be good clean fun! ;-)
Honestly, I think such helper (to show whether it's in the atomic context)
is useful to driver users, even it could cause some false positive in some
configuration but it's acceptable.
Anyway, I could "Always use a workqueue.", but again, the original commit
was raised by a real vendor (OPPO), and I think if dropping this, all
downstream users which use dm-verity will be impacted and individual end
users will not be happy as well.
Thanks,
Gao Xiang
More information about the linux-arm-kernel
mailing list