lockdep WARNING at blktests block/011
Damien Le Moal
damien.lemoal at opensource.wdc.com
Tue Oct 4 15:34:55 PDT 2022
On 10/4/22 19:44, Shinichiro Kawasaki wrote:
> On Oct 03, 2022 / 09:28, Keith Busch wrote:
>> On Mon, Oct 03, 2022 at 01:32:41PM +0000, Shinichiro Kawasaki wrote:
>>>
>>> BTW, I came up to another question during code read. I found nvme_reset_work()
>>> calls nvme_dev_disable() before nvme_sync_queues(). So, I think the NVME
>>> controller is already disabled when the reset work calls nvme_sync_queues().
>>
>> Right, everything previously outstanding has been reclaimed, and the queues are
>> quiesced at this point. There's nothing for timeout work to wait for, and the
>> sync is just ensuring every timeout work has returned.
>
> Thank you Keith, good to know that we do not need to worry about the deadlock
> scenario with nvme_reset_work() :) I also checked nvme_reset_prepare() and
> nvme_suspend(). They also call nvme_dev_disable() or nvme_start/wait_freeze()
> before nvme_sync_queues(). So I think the queues are quiesced in these contexts
> too, and we do not need to worry about them either.
>
>>
>> It looks like a timeout is required in order to hit this reported deadlock, but
>> the driver ensures there's nothing to timeout prior to syncing the queues. I
>> don't think lockdep could reasonably know that, though.
>
> I see... From blktests point of view, I would like to suppress the lockdep splat
> which triggers the block/011 failure. I created a quick patch using
> lockdep_off() and lockdep_on() which skips lockdep check for the read lock in
> nvme_sync_io_queues() [1] and confirmed it avoids the lockdep splat. I think
> relaxing this lockdep check is fine because nvme_sync_io_queues() callers do call
> nvme_dev_disable(), nvme_start/wait_freeze() or nvme_stop_queues(). The queues
> are therefore quiesced by the time nvme_sync_io_queues() runs, and the deadlock
> scenario cannot happen.
>
> Any comment on this patch will be appreciated. If this approach is ok,
> I'll post it as a formal patch for review.
>
> [1]
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 66446f1e06c..9d091327a88 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -5121,10 +5121,14 @@ void nvme_sync_io_queues(struct nvme_ctrl *ctrl)
>  {
>  	struct nvme_ns *ns;
> 
I think there should be a comment here explaining why you turn off lockdep.
> +	lockdep_off();
>  	down_read(&ctrl->namespaces_rwsem);
> +	lockdep_on();
>  	list_for_each_entry(ns, &ctrl->namespaces, list)
>  		blk_sync_queue(ns->queue);
> +	lockdep_off();
>  	up_read(&ctrl->namespaces_rwsem);
> +	lockdep_on();
>  }
>  EXPORT_SYMBOL_GPL(nvme_sync_io_queues);
> EXPORT_SYMBOL_GPL(nvme_sync_io_queues);
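
For illustration, the patched function with the kind of comment requested above
might look like this. This is only a sketch assuming the lockdep_off()/lockdep_on()
approach from the RFC patch is kept; the comment wording is mine, not from the
patch author:

```c
void nvme_sync_io_queues(struct nvme_ctrl *ctrl)
{
	struct nvme_ns *ns;

	/*
	 * All callers disable the controller or freeze/quiesce the queues
	 * before calling this function, so no command can time out while
	 * namespaces_rwsem is held for reading here. Lockdep cannot see
	 * that and reports a false-positive deadlock against timeout work,
	 * so skip lockdep tracking for this lock acquisition.
	 */
	lockdep_off();
	down_read(&ctrl->namespaces_rwsem);
	lockdep_on();
	list_for_each_entry(ns, &ctrl->namespaces, list)
		blk_sync_queue(ns->queue);
	lockdep_off();
	up_read(&ctrl->namespaces_rwsem);
	lockdep_on();
}
```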
>
--
Damien Le Moal
Western Digital Research