[PATCH 1/2] block: fix surprise removal for drivers calling blk_set_queue_dying

Markus Blöchl Markus.Bloechl at ipetronik.com
Wed Feb 16 10:08:50 PST 2022


On Wed, Feb 16, 2022 at 06:09:19PM +0100, Christoph Hellwig wrote:
> > I might have missed something here, but assuming I am a driver which
> > employs multiple different queues, some with a disk attached to them,
> > some without (Is that possible? The admin queue e.g.?)
> > and I just lost my connection and want to notify everything below me
> > that their connection is dead.
> > Would I really want to kill disk queues differently from non-disk
> > queues?
> 
> Yes.  Things like the admin queue in nvme are under full control of
> the driver.  While the "disk" queues just get I/O from the file system
> and thus need to be cut off.
> 
> > How is the admin queue killed? Is it even?
> 
> It isn't.  We just stop submitting to it.

Ah, it is in nvme_dev_remove_admin(), so as long as we don't get stuck
ourselves before we get there, things should be fine, since other tasks
waiting in blk_queue_enter() only wait until nvme_remove() is done.
> 
> > > --- a/drivers/block/mtip32xx/mtip32xx.c
> > > +++ b/drivers/block/mtip32xx/mtip32xx.c
> > > @@ -4112,7 +4112,7 @@ static void mtip_pci_remove(struct pci_dev *pdev)
> > >  			"Completion workers still active!\n");
> > >  	}
> > >  
> > > -	blk_set_queue_dying(dd->queue);
> > > +	blk_mark_disk_dead(dd->disk);
> > 
> > This driver is weird, I did not find a reliable hint that dd->disk
> > always exists here. At least mtip_block_remove() has an extra check
> > for that.
> 
> The driver is a bit of a mess indeed, but the disk and queue will be
> non-NULL if ->probe returns successfully so this is fine.  It is more
> that some of the checks are not required.
> 
> > It also only sets QUEUE_FLAG_DEAD if it detects a surprise removal and
> > not QUEUE_FLAG_DYING.
> 
> Yes, this driver will need further work.

Alright, I will more or less ignore this one for now, then.


I noticed that set_capacity() is also called most of the time when
a disk is killed. Should we also move that into blk_mark_disk_dead()?
Any reasons not to?
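
To make the question concrete, here is a rough userspace model of what
I have in mind. It is not kernel code: struct model_disk and all
model_* functions are hypothetical stand-ins for the gendisk state,
blk_mark_disk_dead() and set_capacity(), and the fields only
approximate what the real helpers touch.

```c
/* Userspace model only; nothing here is an actual kernel interface. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-disk state discussed above. */
struct model_disk {
	bool dead;            /* disk has been marked dead */
	bool queue_draining;  /* queue no longer admits new I/O */
	uint64_t capacity;    /* size in 512-byte sectors */
};

/* Models the helper as it stands: mark the disk dead, cut off the queue. */
static void model_mark_disk_dead(struct model_disk *disk)
{
	assert(disk);
	disk->dead = true;
	disk->queue_draining = true;
}

/* Models the separate capacity update most drivers issue on removal. */
static void model_set_capacity(struct model_disk *disk, uint64_t sectors)
{
	disk->capacity = sectors;
}

/* The variant asked about: zeroing the capacity folded into the helper. */
static void model_mark_disk_dead_and_zero(struct model_disk *disk)
{
	model_mark_disk_dead(disk);
	model_set_capacity(disk, 0);
}
```

With the combined variant a driver's removal path would need a single
call instead of two, assuming no driver relies on the capacity staying
non-zero after the disk is marked dead.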



