[PATCH v12] NVMe: Convert to blk-mq
Matias Bjørling
m at bjorling.me
Thu Aug 21 05:07:13 PDT 2014
On 08/19/2014 12:49 AM, Keith Busch wrote:
> On Fri, 15 Aug 2014, Matias Bjørling wrote:
>>
>> * NVMe queues are merged with the tags structure of blk-mq.
>>
>
> I see the driver's queue suspend logic is removed, but I didn't mean to
> imply it was safe to do so without replacing it with something else. I
> thought maybe we could use the blk_stop/start_queue() functions if I'm
> correctly understanding what they're for.
They're usually only used with the legacy request_fn model, not blk-mq.
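For blk-mq, the nearest equivalents I can see are blk_mq_stop_hw_queues()
and blk_mq_start_stopped_hw_queues(). A minimal sketch of how they could
be wrapped for nvme (the nvme_dev/nvme_ns fields follow the driver, but
the helper names and their placement are only an illustration):

static void nvme_stop_queues(struct nvme_dev *dev)
{
	struct nvme_ns *ns;

	/* Stop every namespace queue so no new ->queue_rq() calls arrive */
	list_for_each_entry(ns, &dev->namespaces, list)
		blk_mq_stop_hw_queues(ns->queue);
}

static void nvme_start_queues(struct nvme_dev *dev)
{
	struct nvme_ns *ns;

	/* Restart the stopped hw queues; true = kick them asynchronously */
	list_for_each_entry(ns, &dev->namespaces, list)
		blk_mq_start_stopped_hw_queues(ns->queue, true);
}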
Please correct me if I'm wrong. The flow of suspend is roughly as
follows:
1. Freeze user threads
2. Perform sys_sync
3. Freeze freezable kernel threads
4. Freeze devices
5. ...
On nvme suspend, we process all outstanding requests and cancel any
outstanding IOs before suspending.
From what I can tell, is it still possible for IOs to be submitted and
lost in the process?
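If the cancel happens before submissions are blocked, there is a window
where an IO can slip in and get lost. A rough ordering that would close
that window, as a sketch (it assumes the stop/start helpers above, and
nvme_clear_queue() stands in for whatever path cancels outstanding
commands):

static void nvme_dev_suspend(struct nvme_dev *dev)
{
	int i;

	/* 1. Block new submissions: ->queue_rq() will no longer be called */
	nvme_stop_queues(dev);

	/* 2. Cancel whatever is still in flight on each hw queue */
	for (i = dev->queue_count - 1; i >= 0; i--)
		nvme_clear_queue(dev->queues[i]);

	/* 3. Only now is it safe to power the device down */
}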
>
> With what's in version 12, certain error conditions could lead us to
> free an irq multiple times, even one that no longer belongs to the
> nvme queue.
>
> A couple other things I just noticed:
>
> * We lose the irq affinity hint after a suspend/resume or device reset
> because the driver's init_hctx() isn't called in these scenarios.
Ok, you're right.
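Since blk-mq won't re-run init_hctx() on reset or resume, one option
might be to re-apply the hints from the driver's own reset/resume path.
A sketch, assuming the nvme_queue keeps a pointer back to its hctx (the
helper name is hypothetical):

static void nvme_set_irq_hints(struct nvme_dev *dev)
{
	struct nvme_queue *nvmeq;
	int i;

	for (i = 0; i < dev->queue_count; i++) {
		nvmeq = dev->queues[i];
		if (!nvmeq->hctx)
			continue;
		/* Point the irq back at the cpus blk-mq mapped to this hctx */
		irq_set_affinity_hint(dev->entry[nvmeq->cq_vector].vector,
				      nvmeq->hctx->cpumask);
	}
}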
>
> * After a reset, we are not guaranteed that we even have the same number
> of h/w queues. The driver frees ones beyond the device's capabilities,
> so blk-mq may have references to freed memory. The driver may also
> allocate more queues if it is capable, but blk-mq won't be able to take
> advantage of that.
Ok. Out of curiosity, why can the number of queues the hardware exposes
change across a suspend/resume cycle?
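(For reference, the count is negotiated rather than fixed: the driver
asks for a number of queues with Set Features (Number of Queues, FID
0x07), and the controller replies with how many it actually granted,
which in principle can differ after a reset. Roughly the shape of that
exchange, following the driver's existing nvme_set_queue_count():)

static int nvme_set_queue_count(struct nvme_dev *dev, int count)
{
	int status;
	u32 result;
	/* 0's based values: requested submission/completion queue counts */
	u32 q_count = (count - 1) | ((count - 1) << 16);

	status = nvme_set_features(dev, NVME_FEAT_NUM_QUEUES, q_count, 0,
				   &result);
	if (status)
		return status < 0 ? -EIO : -EBUSY;
	/* The controller may grant fewer (or more) than requested */
	return min(result & 0xffff, result >> 16) + 1;
}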