[RFC PATCH] nvme: fix RCU hole that allowed for endless looping in multipath round robin

John Meneghini jmeneghi at redhat.com
Tue Apr 5 06:14:16 PDT 2022


I just want to report back that our partner has tested this patch and verified that it fixes the soft-lockup problems.

With this patch applied, they can no longer reproduce the problem when running the same tests on the same testbed.

/John

On 3/23/22 15:07, John Meneghini wrote:
> This is good. I'm glad everyone agrees.
> 
> Chris, please ask our partner to test this patch in their testbed and verify that
> this fixes the soft-lockup problem they are seeing. I believe they have the ability
> to reproduce this problem on their 8.6 testbed.
> 
> If that works then please re-post a patch so people can consider this for v5.18.
> 
> /John
> 
> 
> On 3/23/22 11:34, Christoph Hellwig wrote:
>> On Wed, Mar 23, 2022 at 04:54:26PM +0200, Sagi Grimberg wrote:
>>>
>>>
>>> On 3/22/22 00:43, Chris Leech wrote:
>>>> Make nvme_ns_remove match the assumptions elsewhere.
>>>>
>>>> 1) !NVME_NS_READY needs to be srcu synchronized to make sure nothing is
>>>>      running in __nvme_find_path or nvme_round_robin_path that will
>>>>      re-assign this ns to current_path.
>>>>
>>>> 2) Any matching current_path entries need to be cleared before removing
>>>>      from the siblings list, to prevent calling nvme_round_robin_path with
>>>>      an "old" ns that's off list.
>>>>
>>>> 3) Finally the list_del_rcu can happen, and then synchronize again
>>>>      before releasing any reference counts.
>>>> ---
>>>>    drivers/nvme/host/core.c | 13 +++++++++----
>>>>    1 file changed, 9 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>> index fd4720d37cc0..20778dc9224c 100644
>>>> --- a/drivers/nvme/host/core.c
>>>> +++ b/drivers/nvme/host/core.c
>>>> @@ -3917,6 +3917,15 @@ static void nvme_ns_remove(struct nvme_ns *ns)
>>>>        set_capacity(ns->disk, 0);
>>>>        nvme_fault_inject_fini(&ns->fault_inject);
>>>>
>>>> +    /* ensure that !NVME_NS_READY is seen
>>>> +     * to prevent this ns going back in current_path
>>>> +     */
>>>> +    synchronize_srcu(&ns->head->srcu);
>>>> +
>>>> +    /* wait for concurrent submissions */
>>>> +    if (nvme_mpath_clear_current_path(ns))
>>>> +        synchronize_srcu(&ns->head->srcu);
>>>
>>> Nothing prevents it from being reselected again.
>>> This is what drove the placement of this after the
>>> ns is removed from the head->list. But that was before
>>> the selection looked into NVME_NS_READY flag...
>>>
>>> This looks legit to me...
>>
>> Yes, this looks pretty sensible.  I'm tempted to just queue it up
>> for 5.18.
>>
> 
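
For readers following the three-step ordering in the quoted commit message, here is a minimal single-threaded userspace sketch of it. Everything named model_* is hypothetical and is not the kernel code: the ready flag stands in for NVME_NS_READY, model_synchronize() is an empty placeholder for synchronize_srcu(), and the sibling list is a bare circular list without a list head. It is only meant to illustrate why a round-robin walk that starts from an ns already taken off the list may never get back to its starting point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct model_ns {
    bool ready;              /* stands in for NVME_NS_READY */
    struct model_ns *next;   /* circular sibling list */
    struct model_ns *prev;
};

struct model_head {
    struct model_ns *current_path;
};

/* Placeholder for synchronize_srcu(); nothing to wait for in this model. */
static void model_synchronize(void) { }

/* Build a three-entry circular sibling list with all entries ready. */
static void model_link3(struct model_ns *a, struct model_ns *b,
                        struct model_ns *c)
{
    a->next = b; b->next = c; c->next = a;
    a->prev = c; b->prev = a; c->prev = b;
    a->ready = b->ready = c->ready = true;
}

/* Like list_del_rcu(): neighbours skip the node, but the node keeps its
 * own next pointer so that concurrent readers can still walk forward. */
static void model_unlink(struct model_ns *ns)
{
    assert(ns->next != NULL && ns->prev != NULL);
    ns->prev->next = ns->next;
    ns->next->prev = ns->prev;
}

/* Does a walk that starts after 'old' get back to 'old' within 'limit'
 * steps? An off-list 'old' is never reached again. */
static bool model_walk_terminates(const struct model_ns *old, int limit)
{
    const struct model_ns *ns = old->next;

    while (ns != old) {
        if (limit-- <= 0)
            return false;
        ns = ns->next;
    }
    return true;
}

/* Removal in the order the commit message describes. */
static void model_ns_remove(struct model_head *head, struct model_ns *ns)
{
    /* 1) make the not-ready state visible before anything else */
    ns->ready = false;
    model_synchronize();

    /* 2) clear a matching current_path while ns is still on the list */
    if (head->current_path == ns) {
        head->current_path = NULL;
        model_synchronize();
    }

    /* 3) only now take ns off the list, then synchronize once more */
    model_unlink(ns);
    model_synchronize();
}
```

In this model, once model_ns_remove() returns, current_path no longer points at the removed entry, so no later walk starts from it. A walk that did start from the unlinked entry would just cycle through the remaining siblings.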