kernel null pointer at nvme_tcp_init_iter+0x7d/0xd0 [nvme_tcp]

Sagi Grimberg sagi at grimberg.me
Tue Feb 9 02:21:53 EST 2021



On 2/8/21 8:21 PM, Ming Lei wrote:
> On Mon, Feb 08, 2021 at 10:42:28AM -0800, Sagi Grimberg wrote:
>>
>>>> Hi Sagi
>>>>
>>>> On 2/8/21 5:46 PM, Sagi Grimberg wrote:
>>>>>
>>>>>> Hello
>>>>>>
>>>>>> We found this kernel NULL pointer issue with latest
>>>>>> linux-block/for-next and it's 100% reproduced, let me know
>>>>>> if you need more info/testing, thanks
>>>>>>
>>>>>> Kernel repo:
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>>>>>> Commit: 11f8b6fd0db9 - Merge branch 'for-5.12/io_uring' into for-next
>>>>>>
>>>>>> Reproducer: blktests nvme-tcp/012
>>>>>
>>>>> Thanks for reporting Ming, I've tried to reproduce this on my VM
>>>>> but did not succeed. Given that you have it 100% reproducible,
>>>>> can you try to revert commit:
>>>>>
>>>>> 0dc9edaf80ea nvme-tcp: pass multipage bvec to request iov_iter
>>>>>
>>>>
>>>> Reverting this commit fixed the issue, and I've attached the config. :)
>>>
>>> Good to know,
>>>
>>> I see some differences that I should probably change to hit this:
>>> -- 
>>> @@ -254,14 +256,15 @@ CONFIG_PERF_EVENTS=y
>>>    # end of Kernel Performance Events And Counters
>>>
>>>    CONFIG_VM_EVENT_COUNTERS=y
>>> +CONFIG_SLUB_DEBUG=y
>>>    # CONFIG_COMPAT_BRK is not set
>>> -CONFIG_SLAB=y
>>> -# CONFIG_SLUB is not set
>>> -# CONFIG_SLOB is not set
>>> -CONFIG_SLAB_MERGE_DEFAULT=y
>>> -# CONFIG_SLAB_FREELIST_RANDOM is not set
>>> +# CONFIG_SLAB is not set
>>> +CONFIG_SLUB=y
>>> +# CONFIG_SLAB_MERGE_DEFAULT is not set
>>> +CONFIG_SLAB_FREELIST_RANDOM=y
>>>    # CONFIG_SLAB_FREELIST_HARDENED is not set
>>> -# CONFIG_SHUFFLE_PAGE_ALLOCATOR is not set
>>> +CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
>>> +CONFIG_SLUB_CPU_PARTIAL=y
>>>    CONFIG_SYSTEM_DATA_VERIFICATION=y
>>>    CONFIG_PROFILING=y
>>>    CONFIG_TRACEPOINTS=y
>>> @@ -299,7 +302,8 @@ CONFIG_HAVE_INTEL_TXT=y
>>>    CONFIG_X86_64_SMP=y
>>>    CONFIG_ARCH_SUPPORTS_UPROBES=y
>>>    CONFIG_FIX_EARLYCON_MEM=y
>>> -CONFIG_PGTABLE_LEVELS=4
>>> +CONFIG_DYNAMIC_PHYSICAL_MASK=y
>>> +CONFIG_PGTABLE_LEVELS=5
>>>    CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
>>> -- 
>>>
>>> Probably CONFIG_SLUB and CONFIG_SLUB_DEBUG should be used.
>>
>> Used your profile and this still does not happen :(
> 
> One obvious error is that nr_segments is computed incorrectly.
> 
> Yi, can you try the following patch?
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 881d28eb15e9..a393d99b74e1 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -239,9 +239,14 @@ static void nvme_tcp_init_iter(struct nvme_tcp_request *req,
>   		offset = 0;
>   	} else {
>   		struct bio *bio = req->curr_bio;
> +		struct bio_vec bv;
> +		struct bvec_iter iter;
> +
> +		nsegs = 0;
> +		bio_for_each_bvec(bv, bio, iter)
> +			nsegs++;
>   
>   		vec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
> -		nsegs = bio_segments(bio);

This was exactly the patch that caused the issue.
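
For context, the diff above swaps bio_segments(), which counts
single-page segments, for a bio_for_each_bvec() loop, which counts
multi-page bvecs. Below is a minimal user-space sketch of why the two
counts can differ. It is not kernel code: struct model_bvec,
MODEL_PAGE_SIZE, both function names and the sample data are
hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical page size for this model; not taken from the kernel. */
#define MODEL_PAGE_SIZE 4096u

/* Simplified stand-in for a multi-page bvec: a contiguous byte range. */
struct model_bvec {
	unsigned int offset;	/* byte offset into the first page */
	unsigned int len;	/* length in bytes, may span several pages */
};

/* Count as a bio_for_each_bvec()-style walk would: one per bvec. */
size_t model_count_bvecs(const struct model_bvec *v, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++)
		if (v[i].len)
			total++;
	return total;
}

/* Count as a per-page walk would: one per page that each bvec touches. */
size_t model_count_single_page_segments(const struct model_bvec *v, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++) {
		unsigned int first, last;

		if (!v[i].len)
			continue;
		first = v[i].offset / MODEL_PAGE_SIZE;
		last = (v[i].offset + v[i].len - 1) / MODEL_PAGE_SIZE;
		total += last - first + 1;
	}
	return total;
}

/* Sample data: two bvecs, each touching two pages. */
const struct model_bvec model_sample[2] = {
	{ 0, 8192 },	/* two full pages */
	{ 512, 4096 },	/* starts mid-page, spills into the next page */
};
```

With this sample, the per-bvec count is 2 and the per-page count is 4,
so whichever count is handed to the iov_iter has to match the way the
bvec array itself is laid out.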