[PATCH 0/6] block: add support for REQ_OP_VERIFY

Martin K. Petersen martin.petersen at oracle.com
Thu Dec 8 20:52:01 PST 2022


> My guess, if true, is it's rationalized with the device is already
> doing patrols in the background - why verify when it's already
> been recently patrolled?

The original SCSI VERIFY operation allowed RAID array firmware to do
background scrubs before disk drives developed anything resembling
sophisticated media management. If verification of a block failed, the
array firmware could go ahead and reconstruct the data from the rest of
the stripe in the background. This substantially reduced the risk of
having to perform block reconstruction in the hot path. And verification
did not have to burden the already slow array CPU/memory/DMA combo with
transferring every block on every attached drive.
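
As a rough illustration of that scrub loop, here is a toy sketch in
Python. Everything in it is hypothetical and not taken from any array
firmware: ToyDrive is an in-memory stand-in for a drive, and the stripe
is assumed to be single-parity XOR (RAID 5 style). The point it shows is
that the controller only moves data when a verify fails.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class ToyDrive:
    """Hypothetical in-memory drive; a block set to None is unreadable."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def verify(self, lba):
        # Media check only: no data is transferred back to the caller.
        return self.blocks[lba] is not None

    def read(self, lba):
        if self.blocks[lba] is None:
            raise IOError("unrecovered read error at LBA %d" % lba)
        return self.blocks[lba]

    def write(self, lba, data):
        self.blocks[lba] = data


def scrub_stripe(drives, lba):
    """Verify one stripe and rebuild a failed member from the others.

    Returns the indices of the members that were repaired.
    """
    bad = [i for i, d in enumerate(drives) if not d.verify(lba)]
    if len(bad) > 1:
        raise IOError("more than one member failed; single parity cannot rebuild")
    for i in bad:
        # Only now is data read from the surviving members of the stripe.
        survivors = [d.read(lba) for j, d in enumerate(drives) if j != i]
        drives[i].write(lba, xor_blocks(survivors))
    return bad
```

In the healthy case scrub_stripe() issues nothing but verify calls, which
is the bandwidth saving described above.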

I suspect that these days it is very hard to find a storage device that
doesn't do media management internally in the background. So from the
perspective of physically exercising the media, VERIFY is probably not
terribly useful anymore.

In that light, having to run VERIFY over the full block range of a
device to identify unreadable blocks seems like a fairly clunky
mechanism. Querying the device for a list of unrecoverable blocks
already identified by the firmware seems like a better interface.

I am not sure I understand this whole "proof that the drive did
something" requirement. If a device lies and implements VERIFY as a noop
it just means you'll get the error during a future READ operation
instead.

No matter what, a successful VERIFY is obviously no guarantee that a
future READ on a given block will be possible. But it doesn't matter
because the useful outcome of a verify operation is the failure, not the
success. It's the verification failure scenario which allows you to take
a corrective action.

If you really want to verify a device's VERIFY implementation, we do have
WRITE UNCORRECTABLE commands in both SCSI and NVMe which allow you to do
that. But I think device validation is a secondary issue. The more
pertinent question is whether we have use cases in the kernel (MD,
btrfs) which would benefit from being able to preemptively identify
unreadable blocks.
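
For what it's worth, the validation idea can be sketched like this. It
is a toy model in Python, not real passthrough code: ToyDevice and its
methods are made-up names, and the "honest" flag simply selects between
a device that checks the media and one that implements VERIFY as a
no-op. The sketch assumes that rewriting a block clears its
uncorrectable state.

```python
class ToyDevice:
    """Hypothetical device model; honest=False makes verify() a no-op."""

    def __init__(self, nblocks, honest=True):
        self.data = [bytes(512)] * nblocks
        self.uncorrectable = set()
        self.honest = honest

    def write_uncorrectable(self, lba):
        # Stand-in for a WRITE UNCORRECTABLE command: mark the block bad.
        self.uncorrectable.add(lba)

    def write(self, lba, data):
        # Assumption: a normal write makes the block readable again.
        self.data[lba] = data
        self.uncorrectable.discard(lba)

    def verify(self, lba):
        if not self.honest:
            return True  # lies: always reports success
        return lba not in self.uncorrectable

    def read(self, lba):
        if lba in self.uncorrectable:
            raise IOError("unrecovered read error at LBA %d" % lba)
        return self.data[lba]


def verify_detects_bad_block(dev, scratch_lba):
    """Return True if verify() fails on a block deliberately marked bad."""
    saved = dev.read(scratch_lba)
    dev.write_uncorrectable(scratch_lba)
    try:
        return not dev.verify(scratch_lba)
    finally:
        # Restore the scratch block so the test leaves no damage behind.
        dev.write(scratch_lba, saved)
```

A device with a no-op VERIFY makes verify_detects_bad_block() return
False, and the same block would only have reported its error on a later
READ.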

-- 
Martin K. Petersen	Oracle Linux Engineering



More information about the Linux-nvme mailing list