[EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request

Michal Kalderon mkalderon at marvell.com
Mon Jun 14 11:14:55 PDT 2021


> From: Sagi Grimberg <sagi at grimberg.me>
> Sent: Monday, June 14, 2021 7:45 PM
> 
> 
> >> OK, it seems that the issue is that we are submitting I/O in atomic
> >> context. This should be more appropriate...
> >
> > Thanks Sagi, this seems to work. I'm still hitting some other issues where in
> > some cases reconnect fails, but I'm collecting more info.
> 
> Same type of failures?
No, something else.
After recovery completes, I'm getting the following errors on the initiator side, with no messages at all on the target:
[14678.618025] nvme nvme2: Connect rejected: status -104 (reset by remote host).
[14678.619350] nvme nvme2: rdma connection establishment failed (-104)
[14678.622274] nvme nvme2: Failed reconnect attempt 6
[14678.623623] nvme nvme2: Reconnecting in 10 seconds...
[14751.304247] nvme nvme2: I/O 0 QID 0 timeout
[14751.305749] nvme nvme2: Connect command failed, error wo/DNR bit: 881
[14751.307240] nvme nvme2: failed to connect queue: 0 ret=881
[14751.310497] nvme nvme2: Failed reconnect attempt 7
[14751.312174] nvme nvme2: Reconnecting in 10 seconds...
[14825.032645] nvme nvme2: I/O 1 QID 0 timeout
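
For reference, the numeric codes in the log above can be pulled apart with a short sketch like the one below. It is only an illustration: the helper names are hypothetical, the errno value is the Linux one (104 is ECONNRESET, which matches the "reset by remote host" text), and the status layout (SC in bits 7:0, SCT in bits 10:8, DNR at 0x4000) is assumed from the host-side definitions in include/linux/nvme.h.

```python
# Sketch only: helper names are hypothetical, not part of any kernel or
# nvme-cli interface. Field layout is assumed from include/linux/nvme.h.

LINUX_ERRNO_NAMES = {104: "ECONNRESET"}  # Linux-specific errno value

NVME_SC_DNR = 0x4000  # "Do Not Retry" bit in the host-side status value


def describe_errno(ret: int) -> str:
    """Map a negative kernel return code to a Linux errno name, if known."""
    return LINUX_ERRNO_NAMES.get(-ret, "unknown")


def decode_nvme_status(status: int) -> dict:
    """Split a host-side NVMe status value into its assumed fields."""
    return {
        "sc": status & 0xFF,           # Status Code
        "sct": (status >> 8) & 0x7,    # Status Code Type
        "dnr": bool(status & NVME_SC_DNR),
    }


if __name__ == "__main__":
    print(describe_errno(-104))
    fields = decode_nvme_status(881)
    print(hex(881), fields)
```

Under those assumptions, 881 is 0x371, i.e. status code type 3 (path related) with status code 0x71 and DNR clear, which is consistent with the "wo/DNR bit" wording in the log. If I read the header correctly this is the host-aborted-command status, so it would be set by the initiator itself after the timeout rather than returned by the target.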



