[PATCH 0/4] nvme-blkmq fixes

Mon Jan 5 12:20:47 PST 2015

On 01/05/2015 01:19 PM, Keith Busch wrote:
> On Mon, 5 Jan 2015, Jens Axboe wrote:
>> On 01/05/2015 08:17 AM, Keith Busch wrote:
>>> On Wed, 31 Dec 2014, Jens Axboe wrote:
>>> We need the driver to temporarily block tasks allocating new requests
>>> but let existing requests requeue. Freeze looked good, but unfreeze
>>> expects the usage count to have been 0, which is not guaranteed when
>>> we let failed requests requeue.
>>
>> OK, I think that is a concern we can fix. And yes, that was the intended
>> use case for it originally.
>>
> 
> Okay cool, that would help a lot.
> 
>>> We also need the driver to temporarily prevent the block layer from
>>> submitting requests to the driver's hw queues. 'blk_mq_stop_hw_queues'
>>> looked right, but anyone can restart them at the wrong time by kicking
>>> the requeue list.
>>
>> The driver is the only one that should kick the requeue action into
>> gear, which would start those queues up again. So that should be under
>> your control already.
> 
> Right, we can stop the driver from kicking if it knows a device reset
> is occurring, but there can't be any requeue work prior to stopping all
> h/w queues to prevent a race condition. We could have the driver's reset
> handler call 'cancel_work_sync(q->requeue_work)' to address that. There's
> no existing driver using q->reset_work, but it looks safe to treat
> as public.

Assuming you meant q->requeue_work here. I'd just add an exported
function for that functionality; don't muck with it directly in case we
want to change it later for some reason.
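
As a rough illustration of that kind of wrapper, here is a minimal
userspace sketch in C. The name blk_mq_cancel_requeue_work is only a
suggestion, and struct work_struct, struct request_queue and
cancel_work_sync() below are simplified stand-ins for the kernel
definitions so the example compiles on its own. In the kernel,
cancel_work_sync() also waits for a handler that is already running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct work_struct (assumption). */
struct work_struct {
	bool pending;
};

/* Simplified stand-in for struct request_queue; only the field
 * discussed in this thread is modelled. */
struct request_queue {
	struct work_struct requeue_work;
};

/* Stand-in for the kernel's cancel_work_sync(): clears a pending work
 * item and reports whether it was pending. The real function also
 * waits for a running handler to finish. */
static bool cancel_work_sync(struct work_struct *work)
{
	bool was_pending;

	assert(work != NULL);
	was_pending = work->pending;
	work->pending = false;
	return was_pending;
}

/* Suggested helper name (an assumption, not an existing API at the time
 * of this mail). Drivers call this instead of touching q->requeue_work,
 * so the block layer stays free to change how requeueing works. */
void blk_mq_cancel_requeue_work(struct request_queue *q)
{
	cancel_work_sync(&q->requeue_work);
}
```

A driver's reset handler would call blk_mq_cancel_requeue_work(q) before
stopping the hw queues. In the kernel proper the helper would also need
to be exported to modules (for example with EXPORT_SYMBOL_GPL).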

-- 
Jens Axboe