Unexpected issues with 2 NVME initiators using the same target

Sagi Grimberg sagi at grimberg.me
Mon Mar 6 03:28:56 PST 2017


> Hi Sagi,
>
> I think we need to add fence to the UMR wqe.
>
> so lets try this one:
>
> diff --git a/drivers/infiniband/hw/mlx5/qp.c
> b/drivers/infiniband/hw/mlx5/qp.c
> index ad8a263..c38c4fa 100644
> --- a/drivers/infiniband/hw/mlx5/qp.c
> +++ b/drivers/infiniband/hw/mlx5/qp.c
> @@ -3737,8 +3737,7 @@ static void dump_wqe(struct mlx5_ib_qp *qp, int
> idx, int size_16)
>
>  static u8 get_fence(u8 fence, struct ib_send_wr *wr)
>  {
> -       if (unlikely(wr->opcode == IB_WR_LOCAL_INV &&
> -                    wr->send_flags & IB_SEND_FENCE))
> +       if (wr->opcode == IB_WR_LOCAL_INV || wr->opcode == IB_WR_REG_MR)
>                 return MLX5_FENCE_MODE_STRONG_ORDERING;
>
>         if (unlikely(fence)) {

This will kill performance; isn't there another fix that can
be applied just to the retransmission flow?

> Couldn't repro that case, but I ran some initial tests in my lab (with my
> patch above) - these are not performance servers:
>
> Initiator with 24 CPUs (2 threads/core, 6 cores/socket, 2 sockets),
> Connect-IB HCA (same mlx5_ib driver), kernel 4.10.0, fio test with 24 jobs
> and an iodepth of 128.
> register_always=N
>
> Target - 1 subsystem with 1 ns (null_blk)
>
> bs   read (without/with patch)   write (without/with patch)
> --- --------------------------  ---------------------------
> 512     1019k / 1008k                 1004k / 992k
> 1k      1021k / 1013k                 1002k / 991k
> 4k      1030k / 1022k                 978k  / 969k
>
> CPU usage is 100% for both cases in the initiator side.
> Haven't seen a difference with bs = 16k.
> Not as big a drop as we would expect.
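For context, the quoted setup roughly corresponds to a fio job file like the one below (a sketch only; the device path, ioengine, and runtime are assumptions, not taken from the mail — read and write runs would be separate jobs):

```ini
; Hypothetical fio job approximating the quoted test setup
; (/dev/nvme0n1, libaio, and runtime are assumptions)
[global]
ioengine=libaio
direct=1
numjobs=24
iodepth=128
filename=/dev/nvme0n1
runtime=60
time_based

[randread-512]
rw=randread
bs=512
```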

Obviously you won't see a drop without registering memory
for small IO (register_always=N); that bypasses registration
altogether... Please retest with register_always=Y.
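If it helps, register_always is a module parameter of nvme-rdma, so the retest could be set up along these lines (a sketch; assumes no active connections during the reload, and reconnecting afterwards):

```shell
# Reload nvme-rdma with memory registration forced for all I/O sizes.
modprobe -r nvme-rdma
modprobe nvme-rdma register_always=Y

# Verify the parameter took effect.
cat /sys/module/nvme_rdma/parameters/register_always
```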



More information about the Linux-nvme mailing list