[EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
Michal Kalderon
mkalderon at marvell.com
Mon Jun 14 11:14:55 PDT 2021
> From: Sagi Grimberg <sagi at grimberg.me>
> Sent: Monday, June 14, 2021 7:45 PM
>
>
> >> OK, it seems that the issue is that we are submitting I/O in atomic
> >> context. This should be more appropriate...
> >
> > Thanks Sagi, this seems to work. I'm still hitting some other issues where in
> > some cases reconnect fails, but I'm collecting more info.
>
> Same type of failures?
No, something else.
After recovery completes, I'm getting the following errors on the initiator side, with no messages on the target:
[14678.618025] nvme nvme2: Connect rejected: status -104 (reset by remote host).
[14678.619350] nvme nvme2: rdma connection establishment failed (-104)
[14678.622274] nvme nvme2: Failed reconnect attempt 6
[14678.623623] nvme nvme2: Reconnecting in 10 seconds...
[14751.304247] nvme nvme2: I/O 0 QID 0 timeout
[14751.305749] nvme nvme2: Connect command failed, error wo/DNR bit: 881
[14751.307240] nvme nvme2: failed to connect queue: 0 ret=881
[14751.310497] nvme nvme2: Failed reconnect attempt 7
[14751.312174] nvme nvme2: Reconnecting in 10 seconds...
[14825.032645] nvme nvme2: I/O 1 QID 0 timeout
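
For reference, the numeric codes in the log above can be unpacked with a few lines of Python. This is only a sketch under stated assumptions, not kernel code: the helper name decode_nvme_status is made up for illustration, and the bit layout (status code in bits 7:0, status code type in bits 10:8, DNR at 0x4000) is assumed from the status word as the Linux host driver reports it. The errno value 104 is the Linux number for ECONNRESET, hardcoded here because errno numbering differs between platforms.

```python
# Assumed layout of the status word printed by the Linux NVMe host driver.
NVME_SC_DNR = 0x4000        # "Do Not Retry" bit
LINUX_ECONNRESET = 104      # Linux errno number for ECONNRESET


def decode_nvme_status(status):
    """Split a status word into status code type, status code and DNR flag.

    Hypothetical helper for illustration only.
    """
    return {
        "sct": (status >> 8) & 0x7,   # status code type, bits 10:8
        "sc": status & 0xFF,          # status code, bits 7:0
        "dnr": bool(status & NVME_SC_DNR),
    }


# "Connect command failed, error wo/DNR bit: 881"
connect_status = decode_nvme_status(881)

# "rdma connection establishment failed (-104)"
is_conn_reset = abs(-104) == LINUX_ECONNRESET
```

With these assumptions, 881 is 0x371: status code type 3 (path related) with status code 0x71 and DNR clear, which as far as I can tell corresponds to NVME_SC_HOST_ABORTED_CMD in include/linux/nvme.h and would be consistent with the "I/O 0 QID 0 timeout" line just before it. The -104 matches ECONNRESET, in line with the "reset by remote host" text.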