[PATCH] nvme: fix (S)RCU protection of nvme_ns_head list (alternate)

Chao Leng lengchao at huawei.com
Thu Dec 1 17:21:17 PST 2022



On 2022/12/2 5:17, Caleb Sander wrote:
> On Wed, Nov 30, 2022 at 12:40 AM Sagi Grimberg <sagi at grimberg.me> wrote:
>>
>>
>>>> I understand what you mean in general, but in this particular case
>>>> I don't understand what is not working.
>>>
>>> How does this work?
>>
>> Particularly, the sleeping contexts are guaranteed not to dereference
>> this NS after the two previous srcu synchronization steps so at this
>> point, the only protection of nvme_ns_remove is against this
>> non-sleepable traversal, which should be enough to protect with rcu.
> 
> Can you help me understand the safety here?
> The namespace will be dereferenced when traversing the siblings list,
> which is protected by SRCU.
> But nvme_ns_remove() only synchronizes with RCU between removing the namespace
> from the siblings list and freeing the namespace.
> So it seems like there's a race here:
> Thread A:                               Thread B:
> nvme_ns_remove() executes
> past the last synchronize_srcu()
>                                          nvme_ns_head_submit_bio()
>                                          calls srcu_read_lock(),
>                                          starts traversing siblings list,
>                                          holds pointer to ns
> Removes ns from siblings list
> Calls synchronize_rcu()
> (does not block for the SRCU reader)
del_gendisk() will wait for all outstanding requests to complete, so the use-after-free will not happen.
> nvme_put_ns() reaches 0 references,
> frees ns
>                                          Dereferences ns to continue traversal
>                                          => USE AFTER FREE
> 
> .
> 
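For reference, the interleaving quoted above can be reduced to a toy userspace model. This is only a sketch, not kernel code: struct gp_domain and the model_* helpers are hypothetical names invented for illustration, and a grace-period domain is reduced to a plain reader count. It shows only the narrow point in the diagram, that a synchronize call on one domain does not wait for a reader registered in a different domain. It does not model del_gendisk() or request completion, so it says nothing about whether the free is actually reachable in nvme_ns_remove().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a grace-period domain is reduced to a reader count. */
struct gp_domain {
	int active_readers;
};

static void model_read_lock(struct gp_domain *d)
{
	d->active_readers++;
}

static void model_read_unlock(struct gp_domain *d)
{
	assert(d->active_readers > 0);
	d->active_readers--;
}

/* True if a synchronize call on this domain would have to wait. */
static bool model_sync_would_wait(const struct gp_domain *d)
{
	return d->active_readers > 0;
}

/*
 * Replays the quoted interleaving: the reader (Thread B) enters its
 * read-side section and holds a pointer, then the remover (Thread A)
 * unlinks the object and synchronizes on remover_domain.
 * Returns true if the remover would proceed to the free while the
 * reader is still inside its read-side section.
 */
static bool model_remover_can_free_under_reader(struct gp_domain *reader_domain,
						struct gp_domain *remover_domain)
{
	bool can_free;

	model_read_lock(reader_domain);		/* Thread B holds ns */
	can_free = !model_sync_would_wait(remover_domain);
	model_read_unlock(reader_domain);	/* Thread B finishes traversal */

	return can_free;
}
```

With two separate domains (reader in "srcu", remover synchronizing on "rcu") the helper returns true, which is the window Caleb describes. With the same domain on both sides it returns false, because the synchronize step would wait for the reader.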


