BUG at IP: blk_mq_get_request+0x23e/0x390 on 4.16.0-rc7
Ming Lei
ming.lei at redhat.com
Sun Apr 8 03:48:08 PDT 2018
On Sun, Apr 08, 2018 at 06:44:33PM +0800, Ming Lei wrote:
> On Sun, Apr 08, 2018 at 01:36:27PM +0300, Sagi Grimberg wrote:
> >
> > > Hi Sagi
> > >
> > > Still can reproduce this issue with the change:
> >
> > Thanks for validating Yi,
> >
> > Would it be possible to test the following:
> > --
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 75336848f7a7..81ced3096433 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -444,6 +444,10 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
> >  		return ERR_PTR(-EXDEV);
> >  	}
> >  	cpu = cpumask_first_and(alloc_data.hctx->cpumask, cpu_online_mask);
> > +	if (cpu >= nr_cpu_ids) {
> > +		pr_warn("no online cpu for hctx %d\n", hctx_idx);
> > +		cpu = cpumask_first(alloc_data.hctx->cpumask);
> > +	}
> >  	alloc_data.ctx = __blk_mq_get_ctx(q, cpu);
> >
> >  	rq = blk_mq_get_request(q, NULL, op, &alloc_data);
> > --
> > ...
> >
> >
> > > [ 153.384977] BUG: unable to handle kernel paging request at 00003a9ed053bd48
> > > [ 153.393197] IP: blk_mq_get_request+0x23e/0x390
> >
> > Also would it be possible to provide gdb output of:
> >
> > l *(blk_mq_get_request+0x23e)
>
> nvmf_connect_io_queue() asks blk-mq to allocate a request from one
> specific hw queue, but it is possible that none of the CPUs mapped to
> this hw queue is online.

The following patchset may make this kind of allocation fail instead,
which avoids the kernel oops:
https://marc.info/?l=linux-block&m=152318091025252&w=2
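
To illustrate the failure mode, here is a minimal userspace sketch (not
kernel code, and not taken from the patchset). It models CPU masks as plain
bitmasks; first_and(), pick_cpu() and NR_CPU_IDS are hypothetical names
standing in for cpumask_first_and() and nr_cpu_ids. The point is only that
when no online CPU is mapped to the hw queue, the lookup returns a value
>= the number of CPU ids, and the caller has to treat that as a failed
allocation rather than use it as a CPU index.

```c
#include <stdio.h>

/* Hypothetical CPU count for this model; stands in for nr_cpu_ids. */
#define NR_CPU_IDS 8u

/*
 * Userspace stand-in for cpumask_first_and(): returns the index of the
 * first bit set in both masks, or NR_CPU_IDS if the masks do not intersect.
 */
static unsigned int first_and(unsigned int a, unsigned int b)
{
	unsigned int both = a & b;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (1u << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Pick an online CPU mapped to the hw queue. Returns -1 when none of the
 * CPUs in hctx_mask is online, so the caller can fail the allocation
 * instead of indexing per-CPU data with an out-of-range value.
 */
static int pick_cpu(unsigned int hctx_mask, unsigned int online_mask)
{
	unsigned int cpu = first_and(hctx_mask, online_mask);

	if (cpu >= NR_CPU_IDS)
		return -1;
	return (int)cpu;
}
```

For example, with a hw queue mapped to CPUs 2-3 (mask 0x0c) and only CPUs
0-1 online (mask 0x03), pick_cpu() returns -1; once CPU 2 comes online
(mask 0x07), it returns 2.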
Thanks,
Ming