[PATCH 1/8] blk-mq: add blk_mq_max_nr_hw_queues()

Ming Lei ming.lei at redhat.com
Wed Jul 12 06:31:24 PDT 2023


On Wed, Jul 12, 2023 at 03:19:25PM +0200, Christoph Hellwig wrote:
> On Wed, Jul 12, 2023 at 09:16:11PM +0800, Ming Lei wrote:
> > The problem is that blk_mq_alloc_tag_set() forces to set nr_hw_queues
> > as 1 for kdump kernel, that is why blk_mq_max_nr_hw_queues() has to
> > return 1 for kdump kernel.
> 
> Well, let's fix that first and work from there.  Same argument against
> that deep magic applies there as well.

In short, the driver needs to figure out nr_hw_queues first from hardware
info and then pass it to blk_mq_alloc_tag_set(), but blk_mq_alloc_tag_set()
changes it, which causes the inconsistency.

The only solution along these lines is to tell the driver the max supported
number from the beginning, which is what this patchset is doing.
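
To make the ordering problem concrete, here is a minimal userspace sketch,
not kernel code. Every mock_* name, the mock_is_kdump flag and the nr_cpus
parameter are hypothetical stand-ins invented for illustration; the non-kdump
bound used in mock_max_nr_hw_queues() is an assumption. The sketch only
models the two flows discussed above: the driver sizing its queues before the
tag set clamps the count, versus the driver asking for the limit first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tag set; only the field discussed here. */
struct mock_tag_set {
	unsigned int nr_hw_queues;
};

/* Hypothetical stand-in for "are we a kdump kernel?"; toggled by hand. */
static bool mock_is_kdump = false;

/* Models the clamp: in a kdump kernel the requested count is forced to 1. */
static void mock_alloc_tag_set(struct mock_tag_set *set)
{
	assert(set->nr_hw_queues >= 1);
	if (mock_is_kdump)
		set->nr_hw_queues = 1;
}

/*
 * Models a "tell the driver the max up front" helper. Returning nr_cpus
 * in the non-kdump case is an assumption made for this sketch.
 */
static unsigned int mock_max_nr_hw_queues(unsigned int nr_cpus)
{
	return mock_is_kdump ? 1 : nr_cpus;
}

/*
 * Old flow: the driver sizes its queues from hardware info, then the tag
 * set changes the count behind its back. Returns how many queues the
 * driver set up that blk-mq will not use.
 */
static unsigned int old_flow_unused_queues(unsigned int hw_supported)
{
	unsigned int driver_queues = hw_supported;
	struct mock_tag_set set = { .nr_hw_queues = driver_queues };

	mock_alloc_tag_set(&set);
	return driver_queues - set.nr_hw_queues;
}

/*
 * New flow: the driver asks for the limit first and sizes its queues to
 * it, so the later clamp has nothing left to change.
 */
static unsigned int new_flow_unused_queues(unsigned int hw_supported,
					   unsigned int nr_cpus)
{
	unsigned int max = mock_max_nr_hw_queues(nr_cpus);
	unsigned int driver_queues = hw_supported < max ? hw_supported : max;
	struct mock_tag_set set = { .nr_hw_queues = driver_queues };

	mock_alloc_tag_set(&set);
	return driver_queues - set.nr_hw_queues;
}
```

With mock_is_kdump set, the old flow leaves the driver holding queues that
the tag set no longer counts, while the new flow leaves none, because the
driver never sized itself past the limit.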

> 
> > Thomas, can we disable managed irq for kdump kernel and switch to
> > non-managed irq? Then we can avoid driver's change. I'd suggest
> > this way if it is possible.
> 
> Why the heck would we?

IMO managed irq doesn't make sense in the kdump kernel, which is very
resource limited and has to be reliable.

PCI_IRQ_AFFINITY can be treated as just a hint: pci_alloc_irq_vectors_affinity()
still allocates affinity in the managed way, so queue mapping can work
just fine, and the only difference is that genirq handles these irqs
as non-managed wrt. migration.

This way should solve the queue mapping issue, but the driver still allocates
lots of queues, which wastes resources. So it looks like we still have to
fix drivers.


Thanks, 
Ming




More information about the Linux-nvme mailing list