nvme-host: disk corruptions when issuing IDENTIFY commands via ioctl()
Maurizio Lombardi
mlombard at redhat.com
Tue Mar 8 08:45:20 PST 2022
Hello,
I recently received a bug report about disk corruption when issuing an
NVME_IOCTL_ADMIN_CMD / IDENTIFY ioctl() with cmd.data_len = 8192 bytes
and a buffer address that is not aligned to the page size.
This is the C program that we used to reproduce the issue (tested with
5.17.0-rc6): http://bsdbackstore.it/misc/nvme_ioctl_512.c
Simply run it, passing the path to an NVMe device:
./nvme_ioctl_512 /dev/nvme0n1
The behavior is very unpredictable. Sometimes I hit disk corruption
after a few tries, sometimes it takes hours. Sometimes the ioctl()
returns success and sometimes it fails.
We suspect that the root cause is that the nvme-host driver doesn't
enforce the 4096-byte limit for IDENTIFY commands, as the nvme-target
does (see the nvmet_execute_identify() -->
nvmet_check_transfer_len(req, NVME_IDENTIFY_DATA_SIZE) code).
So if we pass an 8192-byte buffer that is not aligned to the page size,
it will span 3 pages on architectures where the page size is 4 KiB,
whereas the NVMe spec says that the data buffer may not cross more than
one page boundary.
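
To make the page arithmetic concrete, here is a minimal sketch that counts
how many pages a user buffer touches. It assumes 4 KiB pages; PAGE_SZ and
pages_spanned() are hypothetical names used for illustration only and are
not taken from the driver or from the reproducer.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages (e.g. x86-64); other architectures may differ. */
#define PAGE_SZ 4096u

/* Hypothetical helper: number of pages touched by [addr, addr + len). */
size_t pages_spanned(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;

    uintptr_t first = addr / PAGE_SZ;             /* page index of the first byte */
    uintptr_t last = (addr + len - 1) / PAGE_SZ;  /* page index of the last byte */

    return (size_t)(last - first + 1);
}
```

With these assumptions, a page-aligned 8192-byte buffer touches 2 pages,
while the same buffer starting 0x200 bytes into a page touches 3, i.e. it
crosses two page boundaries. A 4096-byte buffer never touches more than 2
pages, whatever its alignment.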
Does this make sense to you? What's your opinion on this?
Thanks,
Maurizio Lombardi