4.5-rc iser issues

Ming Lei ming.lei at canonical.com
Sun Feb 14 23:11:11 PST 2016


On Mon, Feb 15, 2016 at 12:29 AM, Sagi Grimberg
<sagig at dev.mellanox.co.il> wrote:
>> Hi Sagi,
>
>
> Hey,
>
>>> But I don't think simply not cloning the biovecs is the right thing
>>> to do in the end.  This must be something with the bvec iterators.
>>
>>
>> I agree with Christoph, and there might be issues somewhere.
>>
>
> Me too, it was just an isolation step...
>
>>>  From the log:
>>> iser: sg[0] dma_addr:0x85FC06000 off:0x0 sz:0x200 dma_len:0x200
>>> iser: sg[1] dma_addr:0x860334000 off:0x0 sz:0x200 dma_len:0x200 <-- gap
>>> iser: sg[2] dma_addr:0x860335000 off:0x0 sz:0x200 dma_len:0x200 <-- gap
>>
>>
>> The gap above shouldn't occur, since blk_bio_segment_split() splits
>> off a new bio when a gap is detected.
>>
>> Something like the following can be added to the driver or prep_fn to
>> check whether the bvecs of the rq are correct:
>>
>> rq_for_each_segment(bvec, sc->request, iter) {
>>       //check if there is gap between bvec
>> }
>
>
> I added this indication and the gap detection does trigger a bio
> split.
>
>>
>> I don't know how to use iser, but everything looks fine after I set
>> the virt boundary to 4095 for null_blk with the attached patch.
>
>
> That's probably because it's artificial and there is no HW with a real
> limitation...
>
>>
>>> Full quote for Ming:
>>>
>>> On Sun, Feb 14, 2016 at 04:02:18PM +0200, Sagi Grimberg wrote:
>>>>
>>>>
>>>>>> I'm bisecting now, there are a couple of patches from Ming in
>>>>>> the area of the bio splitting code...
>>>>>>
>>>>>> CC'ing Ming, Linux-block and Linux-nvme as iser is identical to nvme
>>>>>> wrt the virtual boundary so I think nvme will break as well.
>>
>>
>> The bisected commit was merged in v4.3, and there seem to be no such
>> reports from nvme.
>
>
> I'm wondering how that can be... because clearly iser is seeing gaps
> which, like nvme, it can't handle. Maybe this is scsi specific?

I can reproduce the issue now. It is easy to trigger with your test
code on a scsi device, but somewhat harder to hit on null_blk.

It turns out to be a block core issue in bio_will_gap(), which gets the
last bvec directly via 'bi_io_vec[prev->bi_vcnt - 1]'.  I have posted a
patchset to fix the issue:

http://marc.info/?l=linux-kernel&m=145551975429092&w=2
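
To illustrate why indexing the bvec table directly can miss a gap, here
is a minimal userspace sketch. It is only a model, not kernel code: the
types seg and bio_model and the helpers gap_to_prev(), last_by_index()
and last_by_iter() are hypothetical names made up for this example. It
assumes that a split or cloned bio shares its parent's segment table
while its iterator covers only part of it, and that the gap test has
the usual virt boundary form (next offset non-zero, or previous end not
aligned to the mask).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; not the kernel's struct bio_vec / struct bio. */
struct seg {
	unsigned int offset;	/* offset within the page */
	unsigned int len;	/* length in bytes */
};

struct bio_model {
	const struct seg *vec;	/* segment table, possibly shared with a parent */
	unsigned int vcnt;	/* number of entries in the table */
	unsigned int idx;	/* iterator: first table entry this bio covers */
	unsigned int done;	/* iterator: bytes already consumed in vec[idx] */
	unsigned int size;	/* iterator: total bytes this bio covers */
};

/* Gap test against a virt boundary mask such as 4095. */
static bool gap_to_prev(unsigned int mask, struct seg prev,
			unsigned int next_offset)
{
	return next_offset || ((prev.offset + prev.len) & mask);
}

/* Last segment taken straight from the table, ignoring the iterator. */
static struct seg last_by_index(const struct bio_model *b)
{
	assert(b->vcnt > 0);
	return b->vec[b->vcnt - 1];
}

/* Last segment as seen by the iterator: walk 'size' bytes from 'idx'. */
static struct seg last_by_iter(const struct bio_model *b)
{
	unsigned int i = b->idx, skip = b->done, left = b->size;
	struct seg cur = { 0, 0 };

	while (left > 0 && i < b->vcnt) {
		cur.offset = b->vec[i].offset + skip;
		cur.len = b->vec[i].len - skip;
		if (cur.len > left)
			cur.len = left;	/* bio ends inside this entry */
		left -= cur.len;
		skip = 0;
		i++;
	}
	return cur;
}
```

With a one-entry table { offset 0, len 4096 } and a bio whose iterator
covers only the first 512 bytes, last_by_index() reports a segment
ending at 4096, so gap_to_prev() with mask 4095 sees no gap before a
following segment at offset 0. last_by_iter() reports a segment ending
at 512, which is not aligned, so the same test flags a gap. That matches
the 0x200-sized entries in the log above.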

Thanks,
Ming


