[PATCH] nvme: add DIX support for nvme-rdma
Max Gurtovoy
mgurtovoy at nvidia.com
Mon Aug 29 16:47:32 PDT 2022
On 8/29/2022 6:10 PM, Keith Busch wrote:
> On Mon, Aug 29, 2022 at 05:56:39PM +0300, Max Gurtovoy wrote:
>> On 8/29/2022 4:16 PM, Chao Leng wrote:
>>> On 2022/8/29 18:43, Max Gurtovoy wrote:
>>>> On 8/29/2022 11:12 AM, Chao Leng wrote:
>>>>
>>>> You can mention that also in iser, if supported, the default is to
>>>> use IP_CHECKSUM for DIX and not CRC.
>>> According to DIX define:DIX = IP_CHECKSUM.
>>> To reduce CPU utilization, the end-to-end DIF for SCSI protocols is
>>> DIX-DIF when supported by hardware.
>> From what I re-call DIX was protection between host_buff -> host_device and
>> DIF was protection between host_device -> target_device.
>>
>> If now its defined as DIX == IP_CHECKSUM and DIF == CRC please mention it
>> somehow in the commit message.
> Where is this coming from? The NVMe command set spec says this is the
> difference between DIF and DIX:
>
> The primary difference between these two mechanisms is the location of the
> protection information. In DIF, the protection information is contiguous with
> the logical block data and creates an extended logical block, while in DIX,
> the protection information is stored in a separate buffer.
>
> Regarding CRC vs IP Checksum, the spec also says this:
>
> In addition to a CRC-16, DIX also specifies an optional IP checksum that is
> not supported by the NVM Express interface.
It's not clear to me what that means.
>
> So DIX support doesn't imply IP checksum. Even if the host device can support
> it, the target device can not report it uses that guard type.
Right.
But we don't send the IP checksum guard on the wire.
The implementation is only used for integrity between host buffer <-->
local HBA.
And the fabrics support only DIF (extended logical block) so IP checksum
guard is not allowed.
So I suggest rewriting the commit message according to the NVMe spec
(which defines DIX and DIF differently than SCSI does) and adding perf
numbers for 4k, 8k, 16k, 32k, 64k, 128k, and 256k IO reads and writes.
Or maybe mention only that the IP checksum becomes the default, if
supported, for offloaded integrity between host buffer <--> local HBA
(as we do in iSER).
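
For reference, here is a rough sketch of the two guard computations
discussed above: the CRC-16 guard (T10-DIF polynomial 0x8BB7) and the
RFC 1071 style IP checksum. This is only an illustration, not the kernel
or driver code; the function names guard_crc_t10dif() and
guard_ip_csum() are made up for this example, and the bitwise CRC loop
is deliberately naive. It just shows why the IP checksum is usually the
cheaper guard to compute in software on the host buffer side.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: CRC-16 guard as used by T10-DIF.
 * Polynomial 0x8BB7, initial value 0, no bit reflection, no final XOR.
 * Bit-at-a-time for clarity; real code would use a table or HW offload.
 */
static uint16_t guard_crc_t10dif(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x8BB7);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/*
 * Hypothetical helper: IP checksum guard (RFC 1071).
 * One's complement of the one's complement sum of big-endian 16-bit
 * words; an odd trailing byte is padded with zero.
 */
static uint16_t guard_ip_csum(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((buf[i] << 8) | buf[i + 1]);
    if (i < len)
        sum += (uint32_t)(buf[i] << 8);

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The CRC needs per-bit (or table-driven) work for every byte, while the
IP checksum is a plain sum with a final fold. Either way, per the
discussion above, only the CRC guard is valid on the wire for NVMe.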
More information about the Linux-nvme
mailing list