nvme-tcp: kernel NULL pointer dereference, address: 0000000000000034
Sagi Grimberg
sagi at grimberg.me
Tue Mar 21 06:08:23 PDT 2023
>> Can you verify that you never see a bio-less request being polled?
>
> As far as I can tell this is not happening anymore. I could trigger it
> just by doing a
>
> nvme connect-all -P 2
>
> I've added my annotation again and don't see the combination of
>
> hctx type = 2 && bio == 0
>
> anymore:

I think I understand what was happening. blk_execute_rq calls
blk_mq_sched_insert_request and then starts polling. But because
this is a connect, which runs from a single context, it is not
running on the same cpu that the io queue is mapped to, hence the
queue is run asynchronously.

The bio->bi_cookie assignment happens only when the request is started
(blk_mq_start_request is called). Given that this is async, it can
and does occur after blk_execute_rq has started polling and
dereferencing bio->bi_cookie.
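
To illustrate the ordering, here is a toy userspace model. It is not
kernel code: the toy_* names and structures are made up for this
example and only mimic the roles of the request, the bio and its
cookie. The assumption it encodes is the one described above: the
cookie becomes usable only once the request is started, so a poll that
begins before the start has nothing valid to look at.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct bio / struct request. */
struct toy_bio {
	unsigned int cookie;	/* plays the role of bi_cookie */
	bool cookie_valid;	/* set once the request was started */
};

struct toy_request {
	struct toy_bio *bio;	/* may be NULL for a bio-less request */
	bool started;
};

/* Plays the role of starting the request: the cookie is assigned here. */
static void toy_start_request(struct toy_request *rq, unsigned int hctx_idx)
{
	rq->started = true;
	if (rq->bio) {
		rq->bio->cookie = hctx_idx;
		rq->bio->cookie_valid = true;
	}
}

/* True only if the poller would find a bio with an assigned cookie. */
static bool toy_poll_is_safe(const struct toy_request *rq)
{
	return rq->bio != NULL && rq->bio->cookie_valid;
}

/*
 * Models the submitter: insert, then poll.
 * async_dispatch == false: the request is started inline, before the poll.
 * async_dispatch == true:  the start is deferred to another context and
 *                          lands only after the submitter began polling.
 */
static bool toy_execute(struct toy_request *rq, bool async_dispatch)
{
	bool safe;

	if (!async_dispatch)
		toy_start_request(rq, 1);

	safe = toy_poll_is_safe(rq);	/* submitter starts polling here */

	if (async_dispatch)
		toy_start_request(rq, 1);	/* too late for the first poll */

	return safe;
}
```

With a bio attached, toy_execute() reports a safe poll only for the
inline case. In the async case the first poll sees an unassigned
cookie, and a bio-less request is never safe to poll in this model.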

So I think we need to untangle both the need for a bdev in order to
poll and the need for a bio with a stable bi_cookie. The latter patch
does both.