nvme: machine check when running nvme subsystem-reset /dev/nvme0 against direct attach via PCIE slot

Mon Oct 7 08:56:09 PDT 2024

On Thu, 2024-10-03 at 15:04 -0600, Keith Busch wrote:
> On Thu, Sep 26, 2024 at 05:11:05PM -0400, Laurence Oberman wrote:
> > It was reported to Red Hat, seeing issues with using a
> > "nvme subsystem-reset /dev/nvme0" command to test resets.
> 
> I really dislike that command. The side effects are overkill for the
> pci transport...
>  
> > On multiple servers I tested on two types of nvme attached devices
> > These are not the rootfs devices
> > 
> > 1. The front slot (hotplug) devices in a 2.5in format 
> > reset and after some time recover (what is expected)
> > 
> > Example of one working
> > 
> > Does not trap and land up as a machine-check
> 
> <snip>
> 
> > 2. Any kernel (latest upstream 6.11, RHEL8 or RHEL9) causes
> > a machine check and panics the box when it's run against an nvme
> > in a PCIE slot
> > 
> > [  263.862919] mce: [Hardware Error]: CPU 12: Machine Check Exception: 5 Bank 6: ba00000000000e0b
> > [  263.862924] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8571dce4> {intel_idle+0x54/0x90}
> 
> So this wasn't failing before 6.11? As Nilay mentioned, there are some
> changes on how nvme subsystem reset is handled. The main thing being
> this ioctl doesn't automatically trigger an nvme reset. I expected
> delayed recovery might happen, but machine checks are not expected. If
> this was working before, I can only guess right now that the previous
> behavior was accessing MMIO and config quicker and triggered a different
> error path. If you're successful with the PPC patch reverted, I would be
> interested to hear about it.
> 

Hello

Quick update on this.
I went back all the way to 6.8 and the machine check still happens, so
this is not new in 6.11.
I started to think that these HPE servers were just more susceptible to
machine checks on PCIe state changes.

So I tested on a Lenovo as well and still got panics.
I do not think this is worth pursuing, given that Keith already
confirmed the command is not recommended and is far too heavy-handed
for the PCIe transport.

I have told the reporter not to use this type of fault injection on
directly attached nvme devices.
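
For reference, the status word from the machine check quoted above
(Bank 6: ba00000000000e0b) can be picked apart with a few lines of
Python. This is only a rough sketch: it assumes the architectural
IA32_MCi_STATUS layout from the Intel SDM, ignores the model-specific
bits, and decode_mci_status is a made-up helper name, not something
from the kernel or nvme-cli.

```python
# Sketch only: decode the architectural fields of an IA32_MCi_STATUS value.
# Bit positions are assumed from the Intel SDM; decode_mci_status is a
# hypothetical helper, not part of any kernel or nvme-cli tooling.

FLAG_BITS = [
    (63, "VAL"), (62, "OVER"), (61, "UC"), (60, "EN"),
    (59, "MISCV"), (58, "ADDRV"), (57, "PCC"),
]
PARTICIPATION = ["SRC", "RES", "OBS", "generic"]        # PP field
REQUEST = ["ERR", "RD", "WR", "DRD", "DWR", "IRD",
           "PREFETCH", "EVICT", "SNOOP"]                # RRRR field
TRANSACTION = ["memory", "reserved", "I/O", "other"]    # II field
LEVEL = ["L0", "L1", "L2", "LG"]                        # LL field


def decode_mci_status(status):
    """Return the set flag names and the MCA error code fields."""
    flags = [name for bit, name in FLAG_BITS if (status >> bit) & 1]
    mcacod = status & 0xFFFF
    decoded = {
        "flags": flags,
        "mscod": (status >> 16) & 0xFFFF,  # model-specific, left raw
        "mcacod": mcacod,
    }
    # Bus/interconnect compound code: 000F 1PPT RRRR IILL
    # (bit 12, F, is the correction-filtering bit and is ignored here).
    if (mcacod & 0xE800) == 0x0800:
        rrrr = (mcacod >> 4) & 0xF
        decoded["bus_error"] = {
            "participation": PARTICIPATION[(mcacod >> 9) & 0x3],
            "timeout": bool((mcacod >> 8) & 0x1),
            "request": REQUEST[rrrr] if rrrr < len(REQUEST) else "reserved",
            "transaction": TRANSACTION[(mcacod >> 2) & 0x3],
            "level": LEVEL[mcacod & 0x3],
        }
    return decoded


if __name__ == "__main__":
    print(decode_mci_status(0xBA00000000000E0B))
```

Under those assumptions the value reads as a valid, enabled,
uncorrected error with PCC set and no address logged, and the low 16
bits (0x0e0b) fall in the bus/interconnect class with the transaction
type I/O. That would be consistent with the error coming from the PCIe
side rather than from memory or cache.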

Thanks
Laurence