NVMe over RDMA latency

Wendy Cheng s.wendy.cheng at gmail.com
Thu Jul 14 09:43:03 PDT 2016


On Wed, Jul 13, 2016 at 11:25 AM, Ming Lin <mlin at kernel.org> wrote:

>> 1. I imagine you are not polling in the host but rather interrupt
>>     driven, correct? That's a latency source.
>
> It's polling.
>
> root@host:~# cat /sys/block/nvme0n1/queue/io_poll
> 1
>
>>
>> 2. The target code is polling if the block device supports it. Can you
>>     confirm that is indeed the case?
>
> Yes.
>
>>
>> 3. mlx4 has a strong fencing policy for memory registration, which we
>>     always do. That's a latency source. Can you try with
>>     register_always=0?
>
> root@host:~# cat /sys/module/nvme_rdma/parameters/register_always
> N
>
>
>>
>> 4. IRQ affinity assignments. If the SQE is submitted on CPU core X and
>>     the completion comes to CPU core Y, we will consume some latency
>>     with the context switch of waking up fio on CPU core X. Is this
>>     a possible case?
>
> Only 1 CPU online on both host and target machine.
>
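
A side note for anyone reproducing items 3 and 4 above: register_always
is a module parameter, so it can be set at load time, and the IRQ
placement can be read out of /proc. The lines below are only a sketch;
the module reload assumes the controller has been disconnected first,
and the IRQ number is whatever /proc/interrupts shows for the HCA on
that box.

  # item 3: reload nvme_rdma with register_always turned off
  modprobe -r nvme_rdma
  modprobe nvme_rdma register_always=0

  # item 4: check which CPU the completion vectors interrupt
  grep mlx4 /proc/interrupts
  cat /proc/irq/<irq>/smp_affinity   # pin with: echo <mask> > smp_affinity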

Since the above tunables can be easily toggled on/off, could you break
down each one's contribution to the overall latency? E.g. toggle only
io_poll on/off and see how much that alone improves the latency.
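
Something like the minimal fio job below is what I have in mind. This
is only a sketch: I'm assuming the same /dev/nvme0n1 device as in your
output, and the pvsync2 engine with --hipri so the reads are submitted
as polled I/O where the kernel supports it; the remaining options are
just a guess at a QD1 4k random-read job.

  echo 0 > /sys/block/nvme0n1/queue/io_poll    # repeat the run with 1
  fio --name=qd1-randread --filename=/dev/nvme0n1 --direct=1 \
      --ioengine=pvsync2 --hipri --rw=randread --bs=4k --iodepth=1 \
      --runtime=30 --time_based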

From your data, it seems the local performance on the target got
worse. Is that perception correct?

Before the tunable: the target avg=22.35 usec
After the tunable: the target avg=23.59 usec

I'm particularly interested in the local target device latency with
io_poll on vs. off. Did you keep the p99.99 and p90.00 latency numbers
from this experiment, and could you share them?
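
If those numbers weren't kept, fio can be told to print exactly those
percentiles on a re-run; a sketch, reusing the same hypothetical QD1
job as above:

  fio --name=qd1-randread --filename=/dev/nvme0n1 --direct=1 \
      --ioengine=pvsync2 --hipri --rw=randread --bs=4k --iodepth=1 \
      --runtime=30 --time_based --percentile_list=90:99.99

The completion-latency section of the output then reports the 90.00th
and 99.99th percentiles directly, which makes the tail-latency
comparison with io_poll on vs. off easy to read off.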

Thanks,
Wendy


