nvme-tcp: kernel NULL pointer dereference, address: 0000000000000034

Sagi Grimberg sagi at grimberg.me
Tue Mar 21 06:08:23 PDT 2023


>> Can you verify that you never see a bio-less request being polled?
> 
> As far as I can tell this is not happening anymore. I could trigger it
> just by doing a
> 
>     nvme connect-all -P 2
> 
> I've added my annotation again and don't see the combination of
> 
>    hctx type = 2 && bio == 0
> 
> anymore:

I think I understand what was happening. blk_execute_rq calls
blk_mq_sched_insert_request and then starts polling. But because
this is a connect, which runs from a single context, it is not
running on the same CPU that the I/O queue is mapped to, hence
the queue run is async.

The bio->bi_cookie assignment happens only when the request is started
(when blk_mq_start_request is called). Given that the queue run is async,
this can and does occur after blk_execute_rq has started polling and
dereferencing bio->bi_cookie.
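
To make the ordering concrete, here is a minimal userspace sketch of the
sequence described above. It is not kernel code: the toy_* names, the
TOY_COOKIE_NONE sentinel, and the struct layouts are all made up for
illustration, and the real polling path did not necessarily have the
guard that toy_poll_once has.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct toy_bio {
	unsigned int bi_cookie;
};

struct toy_request {
	struct toy_bio *bio;   /* may be NULL for a bio-less request */
	unsigned int hctx_num; /* hypothetical index of the mapped hw queue */
};

#define TOY_COOKIE_NONE 0xffffffffu /* hypothetical "not assigned yet" value */

/* Models the step done when the request is started: publish the cookie. */
static void toy_start_request(struct toy_request *rq)
{
	if (rq->bio)
		rq->bio->bi_cookie = rq->hctx_num;
}

/* Models one poll attempt: returns 1 if a valid cookie was visible. */
static int toy_poll_once(const struct toy_request *rq, unsigned int *cookie)
{
	if (!rq->bio || rq->bio->bi_cookie == TOY_COOKIE_NONE)
		return 0;
	*cookie = rq->bio->bi_cookie;
	return 1;
}

/* Replays the async ordering: poll first, start later, poll again. */
static unsigned int toy_demo(void)
{
	struct toy_bio bio = { .bi_cookie = TOY_COOKIE_NONE };
	struct toy_request rq = { .bio = &bio, .hctx_num = 2 };
	unsigned int cookie = TOY_COOKIE_NONE;

	/* The submitter starts polling before the deferred queue run. */
	int early = toy_poll_once(&rq, &cookie);
	assert(early == 0);

	/* The deferred queue run finally starts the request. */
	toy_start_request(&rq);

	int late = toy_poll_once(&rq, &cookie);
	assert(late == 1);

	printf("early=%d late=%d cookie=%u\n", early, late, cookie);
	return cookie;
}
```

Calling toy_demo() from a main() prints "early=0 late=1 cookie=2": the
first poll runs before the cookie exists, which is the window I am
describing.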

So I think we need to untangle both the need for a bdev in order to
poll and the need for a bio with a stable bi_cookie. The latter patch
does both.


