New warning `nvme nvme0: using unchecked data buffer`

Keith Busch kbusch at kernel.org
Mon Dec 2 07:49:44 PST 2024


On Mon, Dec 02, 2024 at 04:15:03PM +0100, Paul Menzel wrote:
> Am 02.12.24 um 16:05 schrieb Keith Busch:
> > On Mon, Dec 02, 2024 at 08:56:03AM +0100, Paul Menzel wrote:
> > > Am 18.11.24 um 16:57 schrieb Keith Busch:
> > > > From: Keith Busch <kbusch at kernel.org>
> > > > 
> > > > If the device supports SGLs, use these for all user requests. This
> > > > format encodes the expected transfer length so it can catch short buffer
> > > > errors in a user command, whether it occurred accidentally or maliciously.
> > > > 
> > > > For controllers that support SGL data mode, this is a viable mitigation
> > > > to CVE-2023-6238. For controllers that don't support SGL, log a warning
> > > 
> > > For the layman, what is this security problem?
> > 
> > The passthrough interface can't validate buffer lengths against the
> > command's actual payload. NVMe traditionally did not have explicit
> > buffer sizes encoded in commands, so this only works correctly if the
> > device and host both agree on what the implicit transfer size actually
> > is. More recent NVMe features fixed that problem with explicit buffer
> > sizes in the commands.
> > 
> > Whether by accident or on purpose, user space can request a smaller
> > buffer than the device is going to transfer into it. That will corrupt
> > memory.
> 
> Does the Linux kernel know the buffer size?

Not necessarily. The kernel just knows what the user requested. There
are some commands that the kernel could validate to make sure what the
user requested makes sense for the command it is sending. But there are
also vendor specific commands and command sets that the kernel has no
idea how to decode, and new spec features are always being added that
change how to decode existing commands. So we just have to trust that
the user isn't abusing the interface.
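
To make that limitation concrete, here is a rough user-space sketch; it
is not kernel code. The struct and the names `user_cmd`, `implied_len`
and `check_cmd` are hypothetical, and metadata and other command fields
are ignored. It assumes only that NVM Read and Write carry a 0-based
block count in the low 16 bits of CDW12. The implied length can be
computed for those opcodes, but anything else comes back as "unknown":

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a passthrough request. */
struct user_cmd {
	uint8_t  opcode;   /* NVMe opcode */
	uint32_t cdw12;    /* Read/Write: bits 15:0 = 0-based block count */
	uint32_t data_len; /* buffer size the user claims, in bytes */
};

enum { OP_WRITE = 0x01, OP_READ = 0x02 };
enum verdict { LEN_OK, LEN_TOO_SMALL, LEN_UNKNOWN };

/*
 * Decode the transfer size the device will assume. Returns false for
 * opcodes this sketch cannot decode (vendor specific, other command
 * sets, newer spec features).
 */
static bool implied_len(const struct user_cmd *c, uint32_t lba_size,
			uint64_t *len)
{
	switch (c->opcode) {
	case OP_READ:
	case OP_WRITE:
		*len = ((uint64_t)(c->cdw12 & 0xffff) + 1) * lba_size;
		return true;
	default:
		return false;
	}
}

/* Compare the user's buffer size against the implied transfer size. */
static enum verdict check_cmd(const struct user_cmd *c, uint32_t lba_size)
{
	uint64_t len;

	if (!implied_len(c, lba_size, &len))
		return LEN_UNKNOWN;
	return c->data_len >= len ? LEN_OK : LEN_TOO_SMALL;
}
```

A Read of 8 blocks on a 512-byte LBA format needs 4096 bytes, so a
512-byte buffer is flagged, while a vendor opcode such as 0xC0 cannot be
judged either way.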
 
> > >      [   14.399238] nvme nvme0: using unchecked data buffer
> > > 
> > > What should a user do about it?
> > 
> > Nothing for a user to do. This is an indication that the passthrough
> > interface has been used with a device that can only use implicit
> > transfer lengths. It's more of an indication that improper use of this
> > interface might be the cause of memory corruption observations.
> 
> Could it be fixed by a firmware update?

Sure, it's just a spec feature that a firmware update could enable. But
I suspect devices supporting the feature also have hardware capabilities
to make it faster too, so I'm not sure if implementing it in firmware is
desirable for all devices.
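
For reference, whether a controller advertises the feature is visible in
the SGL Support (SGLS) field of the Identify Controller data. The sketch
below assumes the layout as I understand it from the NVMe base
specification: a little-endian 32-bit field at byte offset 536, with a
non-zero value in bits 1:0 meaning SGLs are supported for NVM commands.
The helper names `id_ctrl_sgls` and `ctrl_supports_sgl` are made up for
illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_CTRL_SGLS_OFFSET 536 /* assumed byte offset of SGLS */

/* Read the little-endian 32-bit SGLS field from a raw 4096-byte
 * Identify Controller buffer. */
static uint32_t id_ctrl_sgls(const uint8_t *id)
{
	const uint8_t *p = id + ID_CTRL_SGLS_OFFSET;

	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Bits 1:0 == 0 means the controller does not support SGLs. */
static bool ctrl_supports_sgl(uint32_t sgls)
{
	return (sgls & 0x3) != 0;
}
```

A firmware update that turned the feature on would show up as those low
bits changing from zero to non-zero.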
 
> I wonder if the level should be reduced then to info, or if it can be
> elaborated. Maybe:
> 
> The PC300 NVMe SK hynix 512GB can only use implicit transfer length.
> Improper use might be the cause of memory corruption observations.

There was a proposal to lock the interface down to known commands, which
I am absolutely against doing. I'm open to changing the log level or
message text to something better, though.


