[PATCH] nvme: add DIX support for nvme-rdma
Chao Leng
lengchao at huawei.com
Tue Aug 30 05:21:29 PDT 2022
On 2022/8/30 10:38, Martin K. Petersen wrote:
>
> Max,
>
>>> According to DIX define:DIX = IP_CHECKSUM.
>>> To reduce CPU utilization, the end-to-end DIF for SCSI protocols is
>>> DIX-DIF when supported by hardware.
>>
>> From what I re-call DIX was protection between host_buff ->
>> host_device and DIF was protection between host_device ->
>> target_device.
>
> DIX is a specification for a SCSI host adapter interface which describes
> how to put the protection information in a different buffer from the
> data buffer.
>
> The optional IP checksum guard tag was an artifact of the DIX efforts
> predating CPUs having suitable CRC calculation offload. We simply
> couldn't calculate the T10 DIF CRC fast enough on a general purpose CPU
> in 2006.
>
> Now that most modern processors (x86_64, ARM) support pclmulqdq or
> similar, IP checksum support is pretty much obsolete.
According to our test results, the IP checksum still significantly
reduces CPU utilization compared with CRC, even though modern
processors handle CRC well.
Host CPU: Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz, 48 processors.
The test result without the patch:
test item                       IOPS     sys CPU usage %   total CPU usage %
Single concurrency 8k read      30694    0.4               0.7
Single concurrency 8k write     20088    0.3               0.6
Single concurrency 256k read    4945     0.9               1.0
Single concurrency 256k write   4672     0.7               0.9
32 concurrency 8k read          421108   6.9               11.3
32 concurrency 8k write         288861   4.3               8.5
32 concurrency 256k read        20215    2.9               3.2
32 concurrency 256k write       19627    3.1               4.6
The test result after the patch is applied:
test item                       IOPS     sys CPU usage %   total CPU usage %
Single concurrency 8k read      30950    0.4               0.7
Single concurrency 8k write     24325    0.3               0.6
Single concurrency 256k read    6919     0.5               0.6
Single concurrency 256k write   5477     0.4               0.7
32 concurrency 8k read          442294   6.3               11.4
32 concurrency 8k write         297841   3.5               8.2
32 concurrency 256k read        20915    1.9               2.5
32 concurrency 256k write       19814    1.8               3.3
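
To make the two tables easier to compare, the relative change per test item
can be derived with a few lines of Python. This is only a sketch: the IOPS
and sys CPU figures are copied from the tables above, while the `results`
layout and the `pct_change` helper are names made up for this illustration.

```python
# (before, after) pairs copied from the two result tables above.
# "iops" is the IOPS column, "sys" is the sys CPU usage % column.
results = {
    "single 8k read":    {"iops": (30694, 30950),   "sys": (0.4, 0.4)},
    "single 8k write":   {"iops": (20088, 24325),   "sys": (0.3, 0.3)},
    "single 256k read":  {"iops": (4945, 6919),     "sys": (0.9, 0.5)},
    "single 256k write": {"iops": (4672, 5477),     "sys": (0.7, 0.4)},
    "32x 8k read":       {"iops": (421108, 442294), "sys": (6.9, 6.3)},
    "32x 8k write":      {"iops": (288861, 297841), "sys": (4.3, 3.5)},
    "32x 256k read":     {"iops": (20215, 20915),   "sys": (2.9, 1.9)},
    "32x 256k write":    {"iops": (19627, 19814),   "sys": (3.1, 1.8)},
}


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


for item, row in results.items():
    iops = pct_change(*row["iops"])
    sys_cpu = pct_change(*row["sys"])
    print(f"{item:<18} IOPS {iops:+6.1f}%   sys CPU {sys_cpu:+6.1f}%")
```

The percentages are only as precise as the one-decimal CPU figures in the
tables, so small differences in the 8k rows should not be over-interpreted.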
>
> That said, I don't have a problem with permitting IP checksum use for
> NVMe RDMA adapters if the hardware is capable. But it would be good to
> get some supporting benchmarks. Plus of course a description of the
> performance vs. data integrity trade-off wrt. using the weaker IP
> checksum.
The checksum is only used between host_buff -> host HBA, and that
window is very short. If the hardware supports it, this is useful for
reducing CPU utilization, and the level of data protection is
acceptable, as it is for SCSI.
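
For readers who want to see why the two guard tag types differ in cost, here
is a minimal Python sketch of both calculations. It is not the kernel code:
the function names `ip_checksum` and `crc_t10dif` are chosen for this
example, the IP checksum follows the RFC 1071 ones' complement sum, and the
CRC is the plain bit-by-bit form of the T10 DIF polynomial 0x8BB7. Real
implementations use table-driven or carry-less multiply variants, so this
only illustrates the arithmetic, not the measured performance.

```python
def ip_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add one 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def crc_t10dif(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):  # one shift and conditional XOR per input bit
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


block = bytes(range(256)) * 2  # a 512-byte example sector
print(f"IP checksum guard: {ip_checksum(block):#06x}")
print(f"T10 CRC guard:     {crc_t10dif(block):#06x}")
```

The checksum needs one addition per 16-bit word, while the CRC is a
polynomial division over every bit. The flip side is the one Martin
mentions: the checksum is the weaker of the two error detection codes.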
>