[PATCH 2/6] nvme-pci: fix the freeze and quiesce for shutdown and reset case
jianchao.wang
jianchao.w.wang at oracle.com
Thu Feb 8 17:41:59 PST 2018
Hi Keith
Thanks for your precious time and kind response.
On 02/08/2018 11:15 PM, Keith Busch wrote:
> On Thu, Feb 08, 2018 at 10:17:00PM +0800, jianchao.wang wrote:
>> There is a dangerous scenario which caused by nvme_wait_freeze in nvme_reset_work.
>> please consider it.
>>
>> nvme_reset_work
>> -> nvme_start_queues
>> -> nvme_wait_freeze
>>
>> if the controller no response, we have to rely on the timeout path.
>> there are issues below:
>> nvme_dev_disable need to be invoked.
>> nvme_dev_disable will quiesce queues, cancel and requeue and outstanding requests.
>> nvme_reset_work will hang at nvme_wait_freeze
>
> We used to not requeue timed out commands, so that wasn't a problem
> before. Oh well, I'll take a look.
>
Yes, we indeed don't requeue the timed-out commands, but nvme_dev_disable will requeue the other
outstanding requests and quiesce the request queues. This blocks nvme_reset_work->nvme_wait_freeze
from moving forward.
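
To make the hang concrete, here is a toy userspace model, not kernel code. All toy_* names are
hypothetical, and it assumes that a requeued request keeps holding its queue reference (the
"usage" field stands in for q_usage_counter) until it is dispatched and completes. The wait is
bounded only so that the sketch terminates; the real nvme_wait_freeze has no such bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one request queue. Hypothetical, for illustration only. */
struct toy_queue {
	int usage;      /* stands in for q_usage_counter: requests holding the queue */
	int requeued;   /* requests sitting on the requeue list */
	bool quiesced;  /* while true, nothing is dispatched */
};

/*
 * Models what nvme_dev_disable does according to this thread: quiesce
 * the queue and requeue the outstanding requests. The requeued requests
 * still hold their references, so usage is left unchanged.
 */
static void toy_dev_disable(struct toy_queue *q, int outstanding)
{
	assert(outstanding >= 0 && outstanding <= q->usage);
	q->quiesced = true;
	q->requeued += outstanding;
}

/* Lift the quiesce so that requeued requests can be dispatched again. */
static void toy_unquiesce(struct toy_queue *q)
{
	q->quiesced = false;
}

/* One dispatch pass: requeued requests complete only if not quiesced. */
static void toy_run_queue(struct toy_queue *q)
{
	if (q->quiesced)
		return;
	q->usage -= q->requeued;  /* completion drops the references */
	q->requeued = 0;
}

/*
 * Bounded stand-in for nvme_wait_freeze: returns true once usage has
 * drained to zero, false if it has not drained after max_passes.
 */
static bool toy_wait_freeze(struct toy_queue *q, int max_passes)
{
	for (int i = 0; i < max_passes; i++) {
		if (q->usage == 0)
			return true;
		toy_run_queue(q);
	}
	return q->usage == 0;
}
```

In this model, calling toy_dev_disable on a queue with outstanding requests and then
toy_wait_freeze never succeeds, however many passes are allowed, until something calls
toy_unquiesce. That is the dependency that leaves reset_work stuck.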
As I shared in my last email, can we use (or abuse?) blk_set_preempt_only to gate new bios in generic_make_request?
Freezing queues is good, but wait_freeze in reset_work is the devil.
Many thanks
Jianchao