bad unlock balance WARNING at nvme/045

Hannes Reinecke hare at suse.de
Fri Oct 28 06:52:32 PDT 2022


On 10/26/22 14:38, Sagi Grimberg wrote:
> 
[ .. ]
>> Hannes, thanks for the explanations.
>>
>>>
>>> Can you check if this fix works?
>>>
>>> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
>>> index c8a6db7c4498..4e824aab30eb 100644
>>> --- a/drivers/nvme/host/auth.c
>>> +++ b/drivers/nvme/host/auth.c
>>> @@ -926,7 +926,6 @@ void nvme_auth_reset(struct nvme_ctrl *ctrl)
>>>
>>>          mutex_lock(&ctrl->dhchap_auth_mutex);
>>>          list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
>>> -               mutex_unlock(&ctrl->dhchap_auth_mutex);
>>>                  flush_work(&chap->auth_work);
>>>                  __nvme_auth_reset(chap);
>>>          }
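
For reference, with that hunk applied the function takes the mutex once,
holds it across the whole list walk, and releases it once, so lock and
unlock are balanced again. A minimal sketch of the resulting shape
(assuming the unlock after the loop is otherwise unchanged):

void nvme_auth_reset(struct nvme_ctrl *ctrl)
{
	struct nvme_dhchap_queue_context *chap;

	/* taken once before the traversal ... */
	mutex_lock(&ctrl->dhchap_auth_mutex);
	list_for_each_entry(chap, &ctrl->dhchap_auth_list, entry) {
		flush_work(&chap->auth_work);
		__nvme_auth_reset(chap);
	}
	/* ... and released once after it */
	mutex_unlock(&ctrl->dhchap_auth_mutex);
}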
>>
>> I confirmed this hunk avoids the "WARNING: bad unlock balance detected!".
>> As far as I ran blktests with this change, I observed no failures in the
>> other test cases.
>>
>> However, I observed another new WARN at nvme/045: "WARNING: possible
>> recursive locking detected". I think it was caused by nvme_dhchap_auth_work
>> in nvme_wq trying to flush another work, __nvme_auth_work, in the same
>> workqueue. I created a patch below which creates another workqueue,
>> nvme_auth_wq, for __nvme_auth_work. Do you think this fix approach is
>> acceptable?
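
That approach would look roughly like the following sketch (not the
actual patch; the workqueue name string, flags, and call sites here are
assumptions):

/*
 * Give __nvme_auth_work its own workqueue, so that
 * nvme_dhchap_auth_work (running on nvme_wq) no longer flushes a
 * work item queued on its own workqueue.
 */
static struct workqueue_struct *nvme_auth_wq;

static int nvme_auth_init(void)	/* illustrative init hook */
{
	nvme_auth_wq = alloc_workqueue("nvme-auth-wq",
				       WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
	if (!nvme_auth_wq)
		return -ENOMEM;
	return 0;
}

/* ... and wherever the per-queue auth work is kicked off: */
queue_work(nvme_auth_wq, &chap->auth_work);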
> 
> It is fine to flush work on the same workqueue; the problem is that they
> share the same lock.
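
For illustration, the pattern lockdep flags looks roughly like this (a
generic sketch, not the exact auth.c call chain; the shared lock is
presumably the workqueue's own lockdep map, which both work items use):

/* Both works run on the same WQ_MEM_RECLAIM workqueue. While outer_fn
 * executes, the workqueue's lockdep map is held; flush_work() on a
 * work item queued on that same workqueue acquires the map again, and
 * lockdep reports "possible recursive locking".
 */
static struct workqueue_struct *example_wq;	/* WQ_MEM_RECLAIM */
static struct work_struct inner_work;		/* INIT_WORK()ed at init */

static void inner_fn(struct work_struct *w)
{
	/* ... */
}

static void outer_fn(struct work_struct *w)
{
	flush_work(&inner_work);	/* same map acquired recursively */
}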
> 
> I think what we should do is:
> 1. have chap contexts in an array and not a list, so we don't need to
>    traverse to locate a context.
> 2. omit the dhchap_auth_mutex, as I don't see a need to protect the array.
> 3. synchronize only the chap context itself, either with a lock, a flag,
>    or a state (atomic).
> 4. get rid of long-lived allocations; it is not optimizing a whole lot
>    imho, and right now we are reserving something like 5k per queue.
> 5. add one more synchronization between chap and ctrl accessing the keys.
> 
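
Points 1-3 of that list could look roughly like this; all names and
fields below are illustrative, not taken from the actual patchset:

/*
 * Sketch: one chap context per queue in a flat array, no
 * controller-wide dhchap_auth_mutex; each context serializes
 * itself with its own lock (an atomic state flag would do, too).
 */
struct nvme_dhchap_queue_context {
	struct work_struct auth_work;
	struct mutex lock;	/* protects this context only */
	/* ... per-transaction state ... */
};

static void nvme_auth_reset_ctx(struct nvme_dhchap_queue_context *chap)
{
	flush_work(&chap->auth_work);
	mutex_lock(&chap->lock);
	/* clear the per-transaction state here */
	mutex_unlock(&chap->lock);
}

void nvme_auth_reset(struct nvme_ctrl *ctrl)
{
	int q;

	/* direct indexing instead of list traversal (point 1),
	 * with no shared mutex around it (point 2)
	 */
	for (q = 0; q < ctrl->queue_count; q++)
		nvme_auth_reset_ctx(&ctrl->dhchap_ctxs[q]);
}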

I've now sent a patchset which should address this; can you check if
that's what you had in mind?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare at suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer



