smartctl "kills" specific drives on kernel 5.13 but works fine on 5.15 - why?

Keith Busch kbusch at kernel.org
Thu Sep 15 15:24:42 PDT 2022


On Thu, Sep 15, 2022 at 03:48:01PM -0500, Nick Neumann wrote:
> On Thu, Sep 15, 2022 at 3:03 PM Keith Busch <kbusch at kernel.org> wrote:
> > Not sure what MDTS has to do with this. The error log was originally defined to
> > be a max 4k size which is below the smallest possible MDTS.
> >
> > My guess is smartctl tricked the driver into allocating a PRP List, but the
> > controller instead accessed it as a PRP entry, which could corrupt memory or
> > fail the transaction if data direction is enforced by the memory controller.
> > Why that causes the nvme controller to fail as you've described is weird,
> > though.
> 
> I definitely don't know this stuff very well - the smartctl bug
> commentary was referencing the nvme-cli commit where log pages are
> transferred in 4k chunks to avoid having to worry about exceeding the
> MDTS value. The problematic drives have error logs larger than 4K.
> 
> I believe the logic in the smartctl commentary was along the lines of
> "well, the MDTS is large enough that we should be able to transfer
> more than 4k at a time, but we're currently crashing. And nvme-cli
> does it 4k at a time always, and if we change to that, the crash goes
> away, so let's do that."
> 
> As to the allocation, smartctl calls into nvme with
> nvme_admin_get_log_page and passes a buffer (that smartctl allocates)
> of size n * sizeof(nvme_error_log_page), where n is the number of
> error log entries it is trying to read. The fix in smartmontools moved
> from trying to read all of the error log entries at once via a single
> call to nvme_admin_get_log_page, to doing 4K bytes at a time.
> 
> Not sure how helpful any of that is; it's where my current understanding is at.

If 'n' is > 64, then that would tell the driver to allocate a PRP list, and I
am pretty sure based on the observations that the drive believes the address is
a PRP entry. That doesn't readily explain why your ssd became unresponsive
immediately after dispatching the command, but the drive definitely sounds
broken.
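
To make the arithmetic concrete, here is a rough sketch of the NVMe PRP rule
as I understand it from the spec. It is not the Linux driver's code. It
assumes a 4096-byte controller page size and 64-byte error log entries, and
the names pages_spanned and prp_layout are made up for illustration. Once
'n' exceeds 64 the buffer no longer fits in one page, so PRP2 comes into
play; whether PRP2 is a plain entry or a pointer to a PRP list depends on
how many page boundaries the buffer crosses, which in turn depends on its
alignment.

```python
PAGE_SIZE = 4096           # assumed controller memory page size (CC.MPS = 0)
ERROR_LOG_ENTRY_SIZE = 64  # bytes per NVMe error information log entry


def pages_spanned(offset, length):
    """Count the memory pages touched by `length` bytes that start
    `offset` bytes into the first page."""
    first = offset % PAGE_SIZE
    return (first + length + PAGE_SIZE - 1) // PAGE_SIZE


def prp_layout(offset, length):
    """Describe how PRP1/PRP2 would be used for this transfer
    (hypothetical helper, following the spec's PRP rule)."""
    pages = pages_spanned(offset, length)
    if pages == 1:
        return "PRP1 only"
    if pages == 2:
        return "PRP1 + PRP2 as entry"
    return "PRP1 + PRP2 as list pointer"


if __name__ == "__main__":
    # (number of error log entries, buffer offset within its first page)
    for n, offset in [(64, 0), (65, 0), (65, 4064), (256, 0)]:
        length = n * ERROR_LOG_ENTRY_SIZE
        print(n, offset, length, prp_layout(offset, length))
```

A 4096-byte chunk touches at most two pages whatever its alignment, so
reading the log 4K at a time should never need a PRP list. That would be
consistent with the chunked read avoiding the problem on these drives.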