[PATCH 1/2] block: fix surprise removal for drivers calling blk_set_queue_dying
Christoph Hellwig
hch at lst.de
Wed Feb 16 09:09:19 PST 2022
On Wed, Feb 16, 2022 at 04:49:50PM +0100, Markus Blöchl wrote:
> > - blk_queue_flag_set(QUEUE_FLAG_DYING, q);
> > - blk_queue_start_drain(q);
> > + set_bit(GD_DEAD, &disk->state);
> > + blk_queue_start_drain(disk->queue);
> > }
> > -EXPORT_SYMBOL_GPL(blk_set_queue_dying);
> > +EXPORT_SYMBOL_GPL(blk_mark_disk_dead);
>
> I might have missed something here, but assuming I am a driver which
> employs multiple different queues, some with a disk attached to them,
> some without (Is that possible? The admin queue e.g.?)
> and I just lost my connection and want to notify everything below me
> that their connection is dead.
> Would I really want to kill disk queues differently from non-disk
> queues?
Yes. Things like the admin queue in nvme are under full control of
the driver, while the "disk" queues just get I/O from the file system
and thus need to be cut off.
> How is the admin queue killed? Is it even?
It isn't. We just stop submitting to it.
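
As a rough illustration of that distinction, here is a minimal userspace
sketch. It is not kernel code: the model_* types and functions are
hypothetical stand-ins, and the -1 return value is an arbitrary choice
for this model. It only mirrors the two steps visible in the quoted hunk
(set a "dead" flag on the disk, start draining its queue) and contrasts
them with a driver-owned queue where the driver simply stops submitting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's structures. */
struct model_queue {
	bool draining;		/* models the effect of starting a drain */
};

struct model_disk {
	bool dead;		/* models the GD_DEAD bit in disk->state */
	struct model_queue *queue;
};

/* A driver-owned queue (e.g. an admin queue): no disk on top of it. */
struct model_admin {
	bool stopped;		/* the driver's own "stop submitting" decision */
};

/* Mirrors the two steps of the quoted hunk: mark dead, start the drain. */
static void model_mark_disk_dead(struct model_disk *disk)
{
	assert(disk != NULL && disk->queue != NULL);
	disk->dead = true;
	disk->queue->draining = true;
}

/* I/O arriving from above (the file system) is refused once the disk is dead. */
static int model_submit_fs_io(const struct model_disk *disk)
{
	if (disk->dead || disk->queue->draining)
		return -1;	/* arbitrary error value for this model */
	return 0;
}

/* The driver is the only submitter here, so it just checks its own flag. */
static int model_admin_submit(const struct model_admin *admin)
{
	return admin->stopped ? -1 : 0;
}
```

In this model nothing ever has to "kill" the admin queue: the only
submitter is the driver itself, so setting its own flag is enough. The
disk queue needs an explicit cut-off because its submitters are outside
the driver's control.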
> > --- a/drivers/block/mtip32xx/mtip32xx.c
> > +++ b/drivers/block/mtip32xx/mtip32xx.c
> > @@ -4112,7 +4112,7 @@ static void mtip_pci_remove(struct pci_dev *pdev)
> > "Completion workers still active!\n");
> > }
> >
> > - blk_set_queue_dying(dd->queue);
> > + blk_mark_disk_dead(dd->disk);
>
> This driver is weird, I did not find a reliable hint that dd->disk always
> exists here. At least mtip_block_remove() has an extra check for that.
The driver is a bit of a mess indeed, but the disk and queue will be
non-NULL if ->probe returns successfully so this is fine. It is more
that some of the checks are not required.
> It also only sets QUEUE_FLAG_DEAD if it detects a surprise removal and
> not QUEUE_FLAG_DYING.
Yes, this driver will need further work.