[PATCH v2] nvme: check for valid data from nvme_identify_ns() before using it
Ewan Milne
emilne at redhat.com
Tue Nov 28 08:10:29 PST 2023
On Mon, Nov 27, 2023 at 5:03 PM Keith Busch <kbusch at kernel.org> wrote:
>
> On Mon, Nov 27, 2023 at 03:56:57PM -0500, Ewan D. Milne wrote:
> > When scanning namespaces, it is possible to get valid data from the first
> > call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second
> > call in nvme_update_ns_info_block(). In particular, if the NSID becomes
> > inactive between the two commands, a storage device may return a buffer
> > filled with zeroes as per 4.1.5.1. In this case, we can get a kernel crash
> > due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will
> > be set to zero.
> >
> > PID: 326 TASK: ffff95fec3cd8000 CPU: 29 COMMAND: "kworker/u98:10"
> > #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7
> > #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa
> > #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788
> > #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb
> > #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce
> > #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595
> > #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6
> > #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926
> > [exception RIP: blk_stack_limits+434]
> > RIP: ffffffff92191872 RSP: ffffad8f8702fc80 RFLAGS: 00010246
> > RAX: 0000000000000000 RBX: ffff95efa0c91800 RCX: 0000000000000001
> > RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
> > RBP: 00000000ffffffff R8: ffff95fec7df35a8 R9: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> > R13: 0000000000000000 R14: 0000000000000000 R15: ffff95fed33c09a8
> > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core]
> > #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core]
> >
> > This happened when the check for valid data was moved out of nvme_identify_ns()
> > into one of the callers. Fix this by checking in both callers.
> >
> > v2: call kfree() on nvme_id_ns struct in error path
> >
> > Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan")
> > Cc: stable at vger.kernel.org
> > Signed-off-by: Ewan D. Milne <emilne at redhat.com>
>
> Thanks, applied for nvme-6.7.
>
> Interestingly enough, I think this is the same as what was recently
> reported here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=218186
>
Yes, from the stack trace and BZ comments it looks like it.
-Ewan
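
For readers following the thread, the idea in the quoted patch description (validate the Identify Namespace data in both callers before any field is used) can be sketched in user-space C roughly as below. This is only an illustration, not the applied patch: the struct and function names prefixed with sketch_ are hypothetical, and it assumes that "valid data" is judged by a nonzero capacity field (called ncap here) and that an invalid buffer is reported as -ENODEV.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the Identify Namespace data. */
struct sketch_id_ns {
	uint64_t ncap;	/* capacity; zero when the device returns an all-zero buffer */
	uint8_t ds;	/* LBA data size as a power of two */
};

/* Hypothetical per-scan bookkeeping. */
struct sketch_ns_info {
	int is_removed;
};

/*
 * Assumed validity test: an all-zero buffer (inactive NSID) has ncap == 0.
 * Marks the namespace as removed and returns -ENODEV in that case.
 */
static int sketch_check_id_ns(const struct sketch_id_ns *id,
			      struct sketch_ns_info *info)
{
	assert(id != NULL && info != NULL);
	if (id->ncap == 0) {
		info->is_removed = 1;
		return -ENODEV;
	}
	return 0;
}

/*
 * Stand-in for the second caller: the identify data is checked again here,
 * so *lba_shift is never written from a zero-filled buffer.
 */
static int sketch_update_ns_info(const struct sketch_id_ns *id,
				 struct sketch_ns_info *info,
				 unsigned int *lba_shift)
{
	int ret = sketch_check_id_ns(id, info);

	if (ret)
		return ret;
	*lba_shift = id->ds;
	return 0;
}
```

With a zero-filled sketch_id_ns, sketch_update_ns_info() returns -ENODEV and leaves *lba_shift untouched, so no later computation sees a shift taken from invalid data.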