[PATCH 1/3] nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
Max Gurtovoy
maxg at mellanox.com
Wed Feb 14 06:20:38 PST 2018
On 2/14/2018 3:40 PM, Sagi Grimberg wrote:
>
>> During port toggle with traffic (using dm-multipath) I see some
>> warnings during ib_destroy_qp that say there are still mrs_used.
>> and therefore also in ib_dealloc_pd that says refcount on pd is not 0.
>>
>> I'll debug it tomorrow hopefully and update.
>
> Is this a regression that happened due to your patch set?
I don't think so. Without my patches we crash. I see a timeout on the admin_q, followed by I/O errors:
[Wed Feb 14 14:10:59 2018] nvme nvme0: I/O 0 QID 0 timeout, reset controller
[Wed Feb 14 14:10:59 2018] nvme nvme0: failed nvme_keep_alive_end_io error=10
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 704258460
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 388820158
[Wed Feb 14 14:10:59 2018] ib_mr_pool_destroy: destroyed 121 mrs, mrs_used 6 for qp 000000008182fc6f
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 489120554
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 399385206
[Wed Feb 14 14:10:59 2018] device-mapper: multipath: Failing path 259:0.
[Wed Feb 14 14:10:59 2018] WARNING: CPU: 9 PID: 12333 at drivers/infiniband/core//verbs.c:1524 ib_destroy_qp+0x159/0x170 [ib_core]
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 269330912
[Wed Feb 14 14:10:59 2018] Modules linked in:
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 211936734
[Wed Feb 14 14:10:59 2018] nvme_rdma(OE)
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 383446442
[Wed Feb 14 14:10:59 2018] nvme_fabrics(OE) nvme_core(OE)
[Wed Feb 14 14:10:59 2018] print_req_error: I/O error, dev nvme0n1, sector 160594228
For some reason not all commands complete before we destroy the QP (we use dm-multipath here).
In iser, where we also saw registered regions left in the pool, we created an all_list and freed the MRs from there...
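A minimal userspace sketch of that all_list idea (hypothetical names, no RDMA dependencies; not iser's actual code): every descriptor is linked onto a second list at allocation time, independently of the free list, so teardown can release descriptors that are still in flight instead of warning about mrs_used:

```c
#include <stdlib.h>

struct mr_desc {
    struct mr_desc *next_free;  /* free-pool linkage */
    struct mr_desc *next_all;   /* all_list linkage, set once at alloc */
    int in_use;
};

struct mr_pool {
    struct mr_desc *free_list;
    struct mr_desc *all_list;
};

void mr_pool_init(struct mr_pool *p)
{
    p->free_list = NULL;
    p->all_list = NULL;
}

/* Take a descriptor from the free list, or allocate a fresh one and
 * record it on all_list. */
struct mr_desc *mr_pool_get(struct mr_pool *p)
{
    struct mr_desc *d = p->free_list;

    if (d) {
        p->free_list = d->next_free;
    } else {
        d = calloc(1, sizeof(*d));
        if (!d)
            return NULL;
        d->next_all = p->all_list;
        p->all_list = d;
    }
    d->in_use = 1;
    return d;
}

void mr_pool_put(struct mr_pool *p, struct mr_desc *d)
{
    d->in_use = 0;
    d->next_free = p->free_list;
    p->free_list = d;
}

/* Teardown walks all_list, so descriptors that never made it back to
 * the free list are still released; returns how many were in flight. */
int mr_pool_destroy(struct mr_pool *p)
{
    int leaked = 0;
    struct mr_desc *d = p->all_list;

    while (d) {
        struct mr_desc *next = d->next_all;
        if (d->in_use)
            leaked++;
        free(d);
        d = next;
    }
    p->free_list = NULL;
    p->all_list = NULL;
    return leaked;
}
```

The point of the second list is that destroy no longer depends on every command having completed and returned its MR to the free pool first.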
-Max.
More information about the Linux-nvme mailing list