kernel null pointer at nvme_tcp_init_iter+0x7d/0xd0 [nvme_tcp]
Sagi Grimberg
sagi at grimberg.me
Tue Feb 9 05:09:58 EST 2021
>>>> Yi, can you try the following patch?
>>>>
>>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>>> index 881d28eb15e9..a393d99b74e1 100644
>>>> --- a/drivers/nvme/host/tcp.c
>>>> +++ b/drivers/nvme/host/tcp.c
>>>> @@ -239,9 +239,14 @@ static void nvme_tcp_init_iter(struct nvme_tcp_request *req,
>>>> offset = 0;
>>>> } else {
>>>> struct bio *bio = req->curr_bio;
>>>> + struct bio_vec bv;
>>>> + struct bvec_iter iter;
>>>> +
>>>> + nsegs = 0;
>>>> + bio_for_each_bvec(bv, bio, iter)
>>>> + nsegs++;
>>>> vec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
>>>> - nsegs = bio_segments(bio);
>>>
>>> This was exactly the patch that caused the issue.
>>
>> What was the issue you are talking about? Any link or commit hash?
>>
>> nvme-tcp builds iov_iter(BVEC) from __bvec_iter_bvec(), so the segment
>> count has to be the actual number of bvecs. But bio_segments() just returns
>> the number of single-page segments, which is wrong for iov_iter.
>>
>> Please see the same usage in lo_rw_aio().
>>
> That is what I suggested, but I also suggested the memory allocation part, which Sagi explained is better to avoid.
>
> In my opinion we should at least try the bvec calculation from lo_rw_aio() and see whether the problem can be fixed, unless reverting the commit is the right approach for some reason.
I'm trying to understand what this is, but I'm failing to reproduce
it. I may ask Yi to add some debug code for this.
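
For reference, the distinction quoted above between the number of bvecs and
the number of single-page segments can be sketched in user-space C. This is
only a rough model, not kernel code: struct model_bvec, MODEL_PAGE_SIZE (4 KiB
assumed), count_bvecs() and count_single_page_segments() are hypothetical
names made up for illustration, standing in for what bio_for_each_bvec() walks
and what bio_segments() reports.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct bio_vec, without the struct page. */
struct model_bvec {
	unsigned int offset;	/* byte offset into the first page */
	unsigned int len;	/* length in bytes, may span several pages */
};

/* One count per stored entry, like the bio_for_each_bvec() loop in the patch. */
static unsigned int count_bvecs(const struct model_bvec *v, size_t n)
{
	unsigned int nsegs = 0;
	size_t i;

	(void)v;
	for (i = 0; i < n; i++)
		nsegs++;
	return nsegs;
}

/* One count per page touched by each entry, like bio_segments(). */
static unsigned int count_single_page_segments(const struct model_bvec *v,
					       size_t n)
{
	unsigned int nsegs = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int first, last;

		if (v[i].len == 0)
			continue;
		first = v[i].offset / MODEL_PAGE_SIZE;
		last = (v[i].offset + v[i].len - 1) / MODEL_PAGE_SIZE;
		nsegs += last - first + 1;
	}
	return nsegs;
}
```

With two entries, {0, 8192} and {512, 4096}, the first helper counts 2 and the
second counts 4. An iov_iter built over the bvec array itself needs the first
number, since it indexes array entries rather than pages.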