NVMe induced NULL deref in bt_iter()

Max Gurtovoy maxg at mellanox.com
Mon Jul 3 05:46:34 PDT 2017



On 7/3/2017 3:03 PM, Ming Lei wrote:
> On Mon, Jul 03, 2017 at 01:07:44PM +0300, Sagi Grimberg wrote:
>> Hi Ming,
>>
>>> Yeah, the above change is correct; for canceling requests in this
>>> way we should use blk_mq_quiesce_queue().
>>
>> I still don't understand why blk_mq_flush_busy_ctxs should hit a NULL
>> deref if we don't touch the tagset...
>
> Looks like no one has mentioned the steps for reproduction, so it isn't
> easy to understand the related use case. Could anyone share the steps
> to reproduce?

Hi Ming,
I create 500 namespaces per 1 subsystem (using a CX4 target and a C-IB 
initiator, but I also saw it in a CX5 vs. CX5 setup).
The NULL deref happens when I remove all the configuration on the target 
(1 port, 1 subsystem, and 500 namespaces, then unload the nvmet modules) 
during traffic to 1 nvme device/ns from the initiator.
I get the NULL deref in the blk_mq_flush_busy_ctxs function, which calls 
sbitmap_for_each_set, on the initiator. It seems like the "struct 
sbitmap_word *word = &sb->map[i];" is NULL. It actually might be non-NULL 
at the beginning of the function and become NULL while the while loop 
there is running.

>
>>
>> Also, I'm wondering in what case we shouldn't use
>> blk_mq_quiesce_queue()? Maybe we should unexport blk_mq_stop_hw_queues()
>> and blk_mq_start_stopped_hw_queues() and always use the
>> quiesce/unquiesce equivalents?
>
> There is at least one case in which we have to stop queues:
>
> 	- when QUEUE_BUSY (now BLK_STS_RESOURCE) happens, some drivers
> 	need to stop queues to avoid hurting the CPU, such as virtio-blk, ...
>
>>
>> The only fishy usage is in nvme_fc_start_fcp_op(), where, if submission
>> fails, the code stops the hw queues and delays them, but I think it
>> should be handled differently...
>
> It looks like the old way of scsi-mq, but scsi has removed that approach
> and avoids stopping the queue.
>
>
> Thanks,
> Ming
>


