[PATCH v3] nvme: fix memory corruption for passthrough metadata

Keith Busch kbusch at kernel.org
Thu Oct 12 08:31:38 PDT 2023


On Thu, Oct 12, 2023 at 06:36:52AM +0200, Christoph Hellwig wrote:
> On Wed, Oct 11, 2023 at 11:04:58AM -0600, Keith Busch wrote:
> 
> > I don't think it's reasonable for the driver to decode every passthrough
> > command to validate the data lengths, or reject ones that we don't know
> > how to decode. SG_IO doesn't do that either.
> 
> I don't want that either, but what can we do against a (possibly
> unprivileged) user corrupting data?

The unpriviledged access is kind of recent. Maybe limit the scope of
decoding to that usage?

We've always known the interface can be misused to corrupt memory and/or
data, and it was always user responsibility to use this interface
reponsibly. We shouldn't disable something people have relied on for
over 10 years just because someone rediscovered ways to break it.

It's not like this is a "metadata" specific thing either; you can
provide short user space buffers and corrupt memory with regular admin
commands, and we have been able to that from day 1. But if you abuse
this interface, it was always your fault; the kernel never took
responsibility to sanity check your nvme input, and I think it's a bad
precedent to start doing it.
 
> SCSI has it much either because it has an explicit data transfer length
> (outside the CDB) instead of trying to build it from information that
> differs per opcode.  One of the many places where it shows that NVMe
> is a very sloppy and badly thought out protocol.

Yeah, implicit PRP length has often been reported as one of the early
protocol "regrets"...



More information about the Linux-nvme mailing list