Unexpected issues with 2 NVME initiators using the same target
Gruher, Joseph R
joseph.r.gruher at intel.com
Fri Mar 17 11:37:07 PDT 2017
> -----Original Message-----
> From: Max Gurtovoy [mailto:maxg at mellanox.com]
>
> I think we need to add fence to the UMR wqe.
>
> diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
> index ad8a263..c38c4fa 100644
> --- a/drivers/infiniband/hw/mlx5/qp.c
> +++ b/drivers/infiniband/hw/mlx5/qp.c
> @@ -3737,8 +3737,7 @@ static void dump_wqe(struct mlx5_ib_qp *qp, int idx, int size_16)
>
> static u8 get_fence(u8 fence, struct ib_send_wr *wr)
> {
> - if (unlikely(wr->opcode == IB_WR_LOCAL_INV &&
> - wr->send_flags & IB_SEND_FENCE))
> + if (wr->opcode == IB_WR_LOCAL_INV || wr->opcode == IB_WR_REG_MR)
> return MLX5_FENCE_MODE_STRONG_ORDERING;
>
> if (unlikely(fence)) {
>
> Joseph,
> please update after trying the 2 patches (separately) + perf numbers.
>
> I'll take it internally and run some more tests with stronger servers using
> ConnectX4 NICs.
>
> These patches are only for testing and not for submission yet. If we find them
> good enough for upstream then we need to distinguish between ConnectX4/IB
> and ConnectX5 (we probably won't see the issue there).
Hi Max-
Our testing with this patch looks good; the failures appear to be completely eliminated. We are not detecting any meaningful performance impact on small-block read workloads. The data below uses a 50Gb CX4 NIC on both initiator and target, with FIO generating the load. Each disk runs 4KB random reads with 4 jobs and a queue depth of 32 per job (a sketch of the fio job file follows the table below). The initiator uses 16 IO queues per attached subsystem. We tested with 2 P3520 disks attached, and again with 7 disks attached.
                            IOPS     Latency (usec)
4.10-RC8,   2 disks      545,695          466.0
With Patch, 2 disks      587,663          432.8
4.10-RC8,   7 disks    1,074,311          829.5
With Patch, 7 disks    1,080,099          825.4
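In case it helps with reproduction, the per-disk workload was along these lines (a hypothetical fio job sketch, not our exact file; the device path is a placeholder for whichever attached namespace is under test):

    [global]
    ; 4KB random reads, 4 jobs, queue depth 32 per job
    ioengine=libaio
    direct=1
    rw=randread
    bs=4k
    iodepth=32
    numjobs=4
    group_reporting=1
    time_based=1
    runtime=60

    [nvmeof-disk]
    ; placeholder path - substitute the NVMe-oF namespace under test
    filename=/dev/nvme0n1

One such job section per attached disk, so the 7-disk runs use seven sections with different filenames.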
You mention these patches are only for testing. How do we get to something that can be submitted upstream?
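Also, for anyone reviewing the change, here is roughly what the whole function reads like with your patch applied (a sketch reconstructed against the 4.10-era drivers/infiniband/hw/mlx5/qp.c; the tail after the first test is the existing code and is unchanged by the patch):

    static u8 get_fence(u8 fence, struct ib_send_wr *wr)
    {
            /* Patched: LOCAL_INV and REG_MR (both built as UMR WQEs on
             * mlx5) now always get strong ordering, instead of only
             * explicitly fenced local invalidates.
             */
            if (wr->opcode == IB_WR_LOCAL_INV || wr->opcode == IB_WR_REG_MR)
                    return MLX5_FENCE_MODE_STRONG_ORDERING;

            /* Unchanged: propagate or escalate an inherited fence mode. */
            if (unlikely(fence)) {
                    if (wr->send_flags & IB_SEND_FENCE)
                            return MLX5_FENCE_MODE_SMALL_AND_FENCE;
                    else
                            return fence;
            } else if (unlikely(wr->send_flags & IB_SEND_FENCE)) {
                    return MLX5_FENCE_MODE_FENCE;
            }

            return 0;
    }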
Thanks!