[bug report] rdma_rxe doesn't work with blktests on 5.14.0-rc6
Yi Zhang
yi.zhang at redhat.com
Tue Aug 17 23:42:05 PDT 2021
On Wed, Aug 18, 2021 at 1:30 PM Zhu Yanjun <zyjzyj2000 at gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 1:09 PM Zhu Yanjun <zyjzyj2000 at gmail.com> wrote:
> >
> > On Wed, Aug 18, 2021 at 12:57 PM Yi Zhang <yi.zhang at redhat.com> wrote:
> > >
> > > Hello
> > >
> > > I found that rdma_rxe doesn't work with blktests on the latest
> > > upstream kernel. Is this a known issue? Feel free to let me know if
> > > you need any info/tests for it. Thanks.
> > >
> > > # nvme_trtype=rdma ./check nvme/008
> > > nvme/008 (create an NVMeOF host with a block device-backed ns) [failed]
> > > runtime 0.323s ... 0.329s
> > > --- tests/nvme/008.out 2021-08-18 00:18:35.380345954 -0400
> > > +++ /root/blktests/results/nodev/nvme/008.out.bad 2021-08-18
> > > 00:35:11.126723074 -0400
> > > @@ -1,5 +1,7 @@
> > > Running nvme/008
> > > -91fdba0d-f87b-4c25-b80f-db7be1418b9e
> > > -uuid.91fdba0d-f87b-4c25-b80f-db7be1418b9e
> > > -NQN:blktests-subsystem-1 disconnected 1 controller(s)
> > > +Failed to write to /dev/nvme-fabrics: Cannot allocate memory
> > > +cat: '/sys/class/nvme/nvme*/subsysnqn': No such file or directory
> > > +cat: /sys/block/n1/uuid: No such file or directory
> > > ...
> > > (Run 'diff -u tests/nvme/008.out
> > > /root/blktests/results/nodev/nvme/008.out.bad' to see the entire diff)
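> > >
> > > For reference, the "Cannot allocate memory" write to
> > > /dev/nvme-fabrics corresponds to the fabrics connect step that
> > > blktests issues; something like the following (address, port and
> > > subsystem name taken from the log below) should reproduce that
> > > step by hand:
> > >
> > > # nvme connect -t rdma -a 10.16.221.116 -s 4420 -n blktests-subsystem-1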
> > >
> > >
> > > [ 981.382774] run blktests nvme/008 at 2021-08-18 00:33:21
> > > [ 981.470796] rdma_rxe: loaded
> > > [ 981.474338] infiniband eno1_rxe: set active
> > > [ 981.474340] infiniband eno1_rxe: added eno1
> > > [ 981.476737] eno2 speed is unknown, defaulting to 1000
> > > [ 981.481803] eno2 speed is unknown, defaulting to 1000
> > > [ 981.486865] eno2 speed is unknown, defaulting to 1000
> > > [ 981.492862] infiniband eno2_rxe: set down
> > > [ 981.492864] infiniband eno2_rxe: added eno2
> > > [ 981.492904] eno2 speed is unknown, defaulting to 1000
> > > [ 981.497957] eno2 speed is unknown, defaulting to 1000
> > > [ 981.504338] eno2 speed is unknown, defaulting to 1000
> > > [ 981.510442] eno3 speed is unknown, defaulting to 1000
> > > [ 981.515509] eno3 speed is unknown, defaulting to 1000
> > > [ 981.520580] eno3 speed is unknown, defaulting to 1000
> > > [ 981.526600] infiniband eno3_rxe: set down
> > > [ 981.526601] infiniband eno3_rxe: added eno3
> > > [ 981.526640] eno3 speed is unknown, defaulting to 1000
> > > [ 981.531693] eno3 speed is unknown, defaulting to 1000
> > > [ 981.538052] eno2 speed is unknown, defaulting to 1000
> > > [ 981.543115] eno3 speed is unknown, defaulting to 1000
> > > [ 981.549088] eno4 speed is unknown, defaulting to 1000
> > > [ 981.554149] eno4 speed is unknown, defaulting to 1000
> > > [ 981.559211] eno4 speed is unknown, defaulting to 1000
> > > [ 981.565201] infiniband eno4_rxe: set down
> > > [ 981.565203] infiniband eno4_rxe: added eno4
> > > [ 981.565242] eno4 speed is unknown, defaulting to 1000
> > > [ 981.570306] eno4 speed is unknown, defaulting to 1000
> > > [ 981.576724] eno2 speed is unknown, defaulting to 1000
> > > [ 981.581785] eno3 speed is unknown, defaulting to 1000
> > > [ 981.586848] eno4 speed is unknown, defaulting to 1000
> > > [ 981.599312] loop0: detected capacity change from 0 to 2097152
> > > [ 981.612261] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> > > [ 981.614215] nvmet_rdma: enabling port 0 (10.16.221.116:4420)
> > > [ 981.622586] nvmet: creating controller 1 for subsystem
> > > blktests-subsystem-1 for NQN
> > > nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0035-4b10-8044-b9c04f463333.
> > > [ 981.622830] nvme nvme0: creating 32 I/O queues.
> > > [ 981.634860] nvme nvme0: failed to initialize MR pool sized 128 for
> > > QID 32 ----------------------- failed here
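> > >
> > > The MR pool failure at QID 32 suggests the rxe device's resource
> > > limits are exhausted while nvme-rdma sets up 32 I/O queues, each
> > > with a 128-entry MR pool. The device limits can be dumped with,
> > > e.g.:
> > >
> > > # ibv_devinfo -v -d eno1_rxe | grep -iE 'max_qp|max_mr'
> > >
> > > and, as a quick check, connecting with fewer I/O queues (nvme-cli's
> > > --nr-io-queues option) may avoid the allocation failure.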
> >
> > Recently a lot of commits were merged into Linux upstream. Can you
> > let us know the kernel version on which this problem occurs?
>
> I mean, the earliest kernel version on which this problem occurs.
It should be 5.14.0-rc5; from 5.14.0-0.rc4 the tests failed with a
different failure log, and 5.13 doesn't have these issues.
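
To narrow this down to a single commit, a standard bisect between the
last good and first bad kernels should work, along the lines of:

# git bisect start
# git bisect bad v5.14-rc4
# git bisect good v5.13

then build and boot each step, rerun "nvme_trtype=rdma ./check
nvme/008", and mark it "git bisect good" or "git bisect bad" until it
converges on a single commit.

Here is the failure log from 5.14.0-0.rc4: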
[ 196.022199] run blktests nvme/008 at 2021-08-18 02:06:48
[ 196.107611] TECH PREVIEW: Soft-RoCE Transport Driver may not be
fully supported.
Please review provided documentation for limitations.
[ 196.121123] rdma_rxe: loaded
[ 196.127604] infiniband eno1_rxe: set active
[ 196.127606] infiniband eno1_rxe: added eno1
[ 196.130061] eno2 speed is unknown, defaulting to 1000
[ 196.135131] eno2 speed is unknown, defaulting to 1000
[ 196.140197] eno2 speed is unknown, defaulting to 1000
[ 196.146175] infiniband eno2_rxe: set down
[ 196.146176] infiniband eno2_rxe: added eno2
[ 196.146225] eno2 speed is unknown, defaulting to 1000
[ 196.151338] Loading iSCSI transport class v2.0-870.
[ 196.152607] eno2 speed is unknown, defaulting to 1000
[ 196.158580] eno3 speed is unknown, defaulting to 1000
[ 196.163661] eno3 speed is unknown, defaulting to 1000
[ 196.168724] eno3 speed is unknown, defaulting to 1000
[ 196.168783] iscsi: registered transport (iser)
[ 196.174702] infiniband eno3_rxe: set down
[ 196.174703] infiniband eno3_rxe: added eno3
[ 196.174744] eno3 speed is unknown, defaulting to 1000
[ 196.180953] eno2 speed is unknown, defaulting to 1000
[ 196.186033] eno3 speed is unknown, defaulting to 1000
[ 196.191511] Rounding down aligned max_sectors from 4294967295 to 4294967288
[ 196.191546] db_root: cannot open: /etc/target
[ 196.192018] eno4 speed is unknown, defaulting to 1000
[ 196.200971] eno4 speed is unknown, defaulting to 1000
[ 196.206033] eno4 speed is unknown, defaulting to 1000
[ 196.212008] infiniband eno4_rxe: set down
[ 196.212010] infiniband eno4_rxe: added eno4
[ 196.212050] eno4 speed is unknown, defaulting to 1000
[ 196.218428] eno2 speed is unknown, defaulting to 1000
[ 196.223494] eno3 speed is unknown, defaulting to 1000
[ 196.228562] eno4 speed is unknown, defaulting to 1000
[ 196.240599] eno2 speed is unknown, defaulting to 1000
[ 196.245656] eno3 speed is unknown, defaulting to 1000
[ 196.250719] eno4 speed is unknown, defaulting to 1000
[ 196.255853] loop: module loaded
[ 196.272763] RPC: Registered rdma transport module.
[ 196.272765] RPC: Registered rdma backchannel transport module.
[ 196.277249] loop0: detected capacity change from 0 to 2097152
[ 196.286992] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 196.289632] nvmet_rdma: enabling port 0 (10.16.221.116:4420)
[ 196.299378] nvmet_rdma: post_recv cmd failed
[ 196.303666] nvmet_rdma: nvmet_rdma_alloc_queue: creating RDMA queue
failed (-22).
[ 196.325813] rdma_rxe: invalid mask or state for qp
[ 196.330608] nvme nvme0: Connect rejected: status 28 (consumer
defined) nvme status 6 (resource not found).
[ 196.340259] nvme nvme0: rdma connection establishment failed (-104)
[ 196.450004] eno2 speed is unknown, defaulting to 1000
[ 196.455067] eno3 speed is unknown, defaulting to 1000
[ 196.460132] eno4 speed is unknown, defaulting to 1000
[ 196.475683] rdma_rxe: unloaded
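
If more detail from rdma_rxe or nvmet_rdma would help, the test can be
rerun with dynamic debug enabled for both modules (assuming the kernel
is built with CONFIG_DYNAMIC_DEBUG), e.g.:

# echo 'module rdma_rxe +p' > /sys/kernel/debug/dynamic_debug/control
# echo 'module nvmet_rdma +p' > /sys/kernel/debug/dynamic_debug/control
# nvme_trtype=rdma ./check nvme/008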