[PATCH 2/3] nvme-multipath: cannot disconnect controller on stuck partition scan

Keith Busch kbusch at kernel.org
Wed Oct 9 09:33:21 PDT 2024


On Wed, Oct 09, 2024 at 08:23:45AM +0200, Hannes Reinecke wrote:
> On 10/8/24 22:41, Keith Busch wrote:
> > On Tue, Oct 08, 2024 at 09:17:47AM +0200, Hannes Reinecke wrote:
> > > Hmm. Sadly, not really. Still stuck in partition scanning.
> > > 
> > > Guess we'll need to check if we have paths available before triggering
> > > partition scan; let me check ...
> > 
> > I think you must have two paths, and the 2nd path cleared the suppress
> > bit before the 1st one finished its call to device_add_disk(). How
> > about this one?
> > 
> Nope. Still stuck, this time in bdev_disk_changed().
> With my testcase _all_ paths return NS_NOT_READY during partition scan, so
> I/O is constantly bounced between paths, and partition scan never returns.
> Doesn't matter where you call it, it's stuck.

Oh right... We need the requeue_work to end the bios when the disk is
dead, but the work can't proceed because it's waiting on its own bios.
Darn, I guess this scheme would need yet another workqueue to do it.
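
To make that dependency concrete, here is a rough userspace model of it.
This is only a sketch, not the kernel code: struct model, requeue_pass()
and partition_scan() are made-up names, and it assumes the partition scan
runs from the same work item that is supposed to end the requeued bios.

```c
#include <stdbool.h>

/* Hypothetical model of the dependency; none of these names exist in the kernel. */
struct model {
	bool disk_dead;      /* controller disconnect has been requested */
	int  requeued_bios;  /* bios bounced back because every path said NS_NOT_READY */
	bool work_busy;      /* the requeue work item is currently executing */
	bool scan_in_work;   /* true: the scan runs inside the requeue work itself */
};

/*
 * One attempt to run the requeue work. A work item never runs concurrently
 * with itself, so the attempt does nothing while the item is still busy.
 * Returns true if the work actually ran.
 */
static bool requeue_pass(struct model *m)
{
	if (m->work_busy)
		return false;
	if (m->disk_dead)
		m->requeued_bios = 0;	/* disk is dead: end all requeued bios */
	return true;
}

/*
 * Model of a partition scan that waits for its own read to complete.
 * Returns true if the bio was ended within max_steps attempts.
 */
static bool partition_scan(struct model *m, int max_steps)
{
	bool done;
	int i;

	if (m->scan_in_work)
		m->work_busy = true;	/* scan occupies the requeue work */

	m->requeued_bios = 1;		/* scan read, bounced by all paths */
	for (i = 0; i < max_steps && m->requeued_bios; i++) {
		m->disk_dead = true;	/* disconnect arrives while we wait */
		requeue_pass(m);	/* someone kicks the requeue work */
	}
	done = (m->requeued_bios == 0);

	if (m->scan_in_work)
		m->work_busy = false;
	return done;
}
```

With scan_in_work set, partition_scan() never sees its bio ended no matter
how often the requeue work is kicked. With it cleared, which stands in for
running the scan from a separate work item, the first pass after the disk
is marked dead ends the bio and the scan returns.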
 
> The only chance we have is to modify I/O handling during scanning
> (cf my new patchset).

I hoped we could avoid special error handling paths, but I'll take
another look at that approach.
