Unexpected issues with 2 NVME initiators using the same target
Sagi Grimberg
sagi at grimberg.me
Mon Mar 6 03:28:56 PST 2017
> Hi Sagi,
>
> I think we need to add fence to the UMR wqe.
>
> so lets try this one:
>
> diff --git a/drivers/infiniband/hw/mlx5/qp.c
> b/drivers/infiniband/hw/mlx5/qp.c
> index ad8a263..c38c4fa 100644
> --- a/drivers/infiniband/hw/mlx5/qp.c
> +++ b/drivers/infiniband/hw/mlx5/qp.c
> @@ -3737,8 +3737,7 @@ static void dump_wqe(struct mlx5_ib_qp *qp, int
> idx, int size_16)
>
> static u8 get_fence(u8 fence, struct ib_send_wr *wr)
> {
> - if (unlikely(wr->opcode == IB_WR_LOCAL_INV &&
> - wr->send_flags & IB_SEND_FENCE))
> + if (wr->opcode == IB_WR_LOCAL_INV || wr->opcode == IB_WR_REG_MR)
> return MLX5_FENCE_MODE_STRONG_ORDERING;
>
> if (unlikely(fence)) {
This will kill performance. Isn't there another fix that can
be applied just to the retransmission flow?
> Couldn't repro that case, but I ran some initial tests in my lab (with
> my patch above) - not performance servers:
>
> Initiator with 24 CPUs (2 threads/core, 6 cores/socket, 2 sockets),
> Connect-IB (same driver, mlx5_ib), kernel 4.10.0, fio test with 24 jobs
> and 128 iodepth.
> register_always=N
>
> Target - 1 subsystem with 1 ns (null_blk)
>
> bs read (without/with patch) write (without/with patch)
> --- -------------------------- ---------------------------
> 512 1019k / 1008k 1004k / 992k
> 1k 1021k / 1013k 1002k / 991k
> 4k 1030k / 1022k 978k / 969k
>
> CPU usage is 100% for both cases in the initiator side.
> Haven't seen a difference with bs = 16k.
> Not as big a drop as we would expect,
Obviously you won't see a drop without registering memory
for small IO (register_always=N); this bypasses registration
altogether. Please retest with register_always=Y.