nvmf/rdma host crash during heavy load and keep alive recovery

Steve Wise swise at opengridcomputing.com
Wed Aug 17 12:07:51 PDT 2016


> >> If that is the case, I think we need to have a closer look at
> >> nvme_stop_queues...
> >>
> >
> > request_queue->queue_flags does have QUEUE_FLAG_STOPPED set:
> >
> > #define QUEUE_FLAG_STOPPED      2       /* queue is stopped */
> >
> > crash> request_queue.queue_flags -x 0xffff880397a13d28
> >   queue_flags = 0x1f07a04
> > crash> request_queue.mq_ops 0xffff880397a13d28
> >   mq_ops = 0xffffffffa084b140 <nvme_rdma_mq_ops>
> >
> > So it appears the queue is stopped, yet a request is being processed for
> > that queue.  Perhaps there is a race where QUEUE_FLAG_STOPPED is set after
> > a request is scheduled?
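
For anyone checking the flag value quoted above: the QUEUE_FLAG_* defines are
bit numbers passed to test_bit(), not masks, so QUEUE_FLAG_STOPPED (2)
corresponds to mask 0x4. A minimal sketch of the same check in shell
arithmetic follows; flag_is_set is a hypothetical helper of mine, not a crash
or kernel command.

```shell
# Test a single bit in a queue_flags value.
# flag_is_set is a hypothetical helper, not part of crash or the kernel.
# $1 = queue_flags value (hex is fine), $2 = bit number
flag_is_set() {
    # Shift the requested bit down to position 0 and mask it off;
    # prints 1 if the bit is set, 0 otherwise.
    echo $(( ($1 >> $2) & 1 ))
}

# queue_flags from the crash dump, QUEUE_FLAG_STOPPED = bit 2
flag_is_set 0x1f07a04 2
```

The low byte of 0x1f07a04 is 0x04, so bit 2 is set, which matches the reading
that the queue was marked stopped.
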
> 
> Umm. When the keep-alive timeout triggers we stop the queues. Only 10
> seconds (or reconnect_delay) later we free the queues and reestablish
> them, so I find it hard to believe that a request was queued, and spent
> so long in queue_rq until we freed the queue-pair.

I agree.

> 
> From your description of the sequence it seems that after 10 seconds we
> attempt a reconnect and during that time an IO request crashes the
> party.
>

Yes. 
 
> I assume this means you ran traffic during the sequence, yes?

Mega fio test streaming to all 10 devices.  I start the following script,
and then bring the link down a few seconds later, which triggers the
keep-alive timeout (kato).  Then 10 seconds later reconnecting starts and
whamo...


for i in $(seq 1 20) ; do

        fio --ioengine=libaio --rw=randwrite --name=randwrite --size=200m \
        --direct=1 --invalidate=1 --fsync_on_close=1 --group_reporting \
        --exitall --runtime=20 --time_based \
        --filename=/dev/nvme0n1 --filename=/dev/nvme1n1 \
        --filename=/dev/nvme2n1 --filename=/dev/nvme3n1 \
        --filename=/dev/nvme4n1 --filename=/dev/nvme5n1 \
        --filename=/dev/nvme6n1 --filename=/dev/nvme7n1 \
        --filename=/dev/nvme8n1 --filename=/dev/nvme9n1 \
        --iodepth=4 --numjobs=32 --bs=2K | grep -i "aggrb\|iops"
        sleep 3
        echo "### Iteration $i Done ###"
done



