[PATCH] nvme-pci: fix queue unquiesce check on slot_reset

Keith Busch kbusch at kernel.org
Mon Apr 28 08:22:55 PDT 2025


On Fri, Apr 25, 2025 at 08:33:28AM -0600, Keith Busch wrote:
> On Fri, Apr 25, 2025 at 03:19:35PM +0200, Christoph Hellwig wrote:
> > On Thu, Apr 24, 2025 at 10:18:01AM -0700, Keith Busch wrote:
> > > From: Keith Busch <kbusch at kernel.org>
> > > 
> > > A zero return means the reset was successfully scheduled. We don't want
> > > to unquiesce the queues while the reset_work is pending, as that will
> > > just flush out requeued requests to a failed completion.
> > > 
> > > Fixes: 71a5bb153be104 ("nvme: ensure disabling pairs with unquiesce")
> > 
> > Sounds like this code path isn't getting tested all that much if this stuck
> > around for so long..
> 
> The conditions that trigger pcie errors were the primary concern. Of
> course you'll get IO errors, right? The pcie connection is flakey! But
> we are supposed to retry IOs after recovery, which we weren't doing, and
> that was a secondary concern I embarrassingly overlooked for many
> reports.

We can still take this, right? It is a good fix, even if we misunderstood
why IO was failing for over a year there.
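
For anyone following along, the check described in the commit message can
be sketched as a small standalone model. This is not the driver code: the
struct, its fields, and every toy_* name are hypothetical stand-ins invented
for illustration. The only behaviour taken from the patch description is
that a zero return means the reset was scheduled, and that the queues
should stay quiesced while that reset is pending.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a controller; all fields are hypothetical. */
struct toy_ctrl {
	bool resetting_allowed;	/* may the state move to "resetting"? */
	bool reset_pending;	/* reset work has been queued */
	bool queues_quiesced;	/* I/O queues are currently quiesced */
};

/* Returns 0 when the reset was scheduled, -EBUSY when it was not. */
static int toy_try_sched_reset(struct toy_ctrl *c)
{
	if (!c->resetting_allowed || c->reset_pending)
		return -EBUSY;
	c->reset_pending = true;
	return 0;
}

static void toy_unquiesce(struct toy_ctrl *c)
{
	c->queues_quiesced = false;
}

/*
 * Model of the slot_reset step with the corrected check: unquiesce only
 * when no reset could be scheduled. On a zero return the pending reset
 * work owns the queues, so they stay quiesced and requeued requests are
 * not flushed out to a failed completion.
 */
static void toy_slot_reset(struct toy_ctrl *c)
{
	if (toy_try_sched_reset(c) != 0)
		toy_unquiesce(c);
}
```

With the inverted check, the first branch would run on success instead,
which is the case the commit message warns about.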


