lockdep WARNING at blktests block/011

Damien Le Moal damien.lemoal at opensource.wdc.com
Tue Oct 4 15:34:55 PDT 2022


On 10/4/22 19:44, Shinichiro Kawasaki wrote:
> On Oct 03, 2022 / 09:28, Keith Busch wrote:
>> On Mon, Oct 03, 2022 at 01:32:41PM +0000, Shinichiro Kawasaki wrote:
>>>
>>> BTW, I came up to another question during code read. I found nvme_reset_work()
>>> calls nvme_dev_disable() before nvme_sync_queues(). So, I think the NVME
>>> controller is already disabled when the reset work calls nvme_sync_queues().
>>
>> Right, everything previously outstanding has been reclaimed, and the queues are
>> quiesced at this point. There's nothing for timeout work to wait for, and the
>> sync is just ensuring every timeout work has returned.
> 
> Thank you Keith, good to know that we do not need to worry about the deadlock
> scenario with nvme_reset_work() :)  I also checked nvme_reset_prepare() and
> nvme_suspend(). They also call nvme_dev_disable() or nvme_start/wait_freeze()
> before nvme_sync_queues(). So I think the queues are quiesced in these context
> also, and do not worry them either.
> 
>>
>> It looks like a timeout is required in order to hit this reported deadlock, but
>> the driver ensures there's nothing to timeout prior to syncing the queues. I
>> don't think lockdep could reasonably know that, though.
> 
> I see... From blktests point of view, I would like to suppress the lockdep splat
> which triggers the block/011 failure. I created a quick patch using
> lockdep_off() and lockdep_on() which skips lockdep check for the read lock in
> nvme_sync_io_queues() [1] and confirmed it avoids the lockdep splat. I think
> this lockdep check relax is fine because nvme_sync_io_queues() callers do call
> nvme_dev_disable(), nvme_start/wait_freeze() or nvme_stop_queues(). The queues
> are quiesced at nvme_sync_io_queues() and the deadlock scenario does not happen.
> 
> Any comment on this patch will be appreciated. If this action approach is ok,
> I'll post as a formal patch for review.
> 
> [1]
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 66446f1e06c..9d091327a88 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -5121,10 +5121,14 @@ void nvme_sync_io_queues(struct nvme_ctrl *ctrl)
>  {
>         struct nvme_ns *ns;
> 

I think there should be a comment here explaining why your turn off lockdep.

> +       lockdep_off();
>         down_read(&ctrl->namespaces_rwsem);
> +       lockdep_on();
>         list_for_each_entry(ns, &ctrl->namespaces, list)
>                 blk_sync_queue(ns->queue);
> +       lockdep_off();
>         up_read(&ctrl->namespaces_rwsem);
> +       lockdep_on();
>  }
>  EXPORT_SYMBOL_GPL(nvme_sync_io_queues);
> 

-- 
Damien Le Moal
Western Digital Research




More information about the Linux-nvme mailing list