[PATCH] nvme/scsi: Consider LBA format in IO splitting calculation

Jens Axboe axboe at kernel.dk
Mon Apr 24 21:32:13 PDT 2017


On 04/24/2017 09:27 PM, Damien Le Moal wrote:
> Jon,
> 
> On 4/25/17 13:00, Jens Axboe wrote:
>> On Mon, Apr 24 2017, Jon Derrick wrote:
>>> The current command submission code uses a sector-based value when
>>> considering the maximum number of blocks per command. With a
>>> 4k-formatted namespace and a command exceeding max hardware limits, this
>>> calculation doesn't split IOs which should be split and fails in the
>>> nvme layer. This patch fixes that calculation and enables IO splitting
>>> in these circumstances.
>>>
>>> Signed-off-by: Jon Derrick <jonathan.derrick at intel.com>
>>> ---
>>>  drivers/nvme/host/scsi.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/scsi.c b/drivers/nvme/host/scsi.c
>>> index f49ae27..988da61 100644
>>> --- a/drivers/nvme/host/scsi.c
>>> +++ b/drivers/nvme/host/scsi.c
>>> @@ -1609,7 +1609,7 @@ static int nvme_trans_do_nvme_io(struct nvme_ns *ns, struct sg_io_hdr *hdr,
>>>  	struct nvme_command c;
>>>  	u8 opcode = (is_write ? nvme_cmd_write : nvme_cmd_read);
>>>  	u16 control;
>>> -	u32 max_blocks = queue_max_hw_sectors(ns->queue);
>>> +	u32 max_blocks = queue_max_hw_sectors(ns->queue) >> (ns->lba_shift - 9);
>>>  
>>>  	num_cmds = nvme_trans_io_get_num_cmds(hdr, cdb_info, max_blocks);
>>
>> Patch looks correct to me, as we always consider the hw sectors settings
>> in units of 512b blocks.
>>
>> Reviewed-by: Jens Axboe <axboe at fb.com>
> 
> May be replace 9 with SECTOR_SHIFT ?
> 
> Jens,
> 
> I just realized that this macro is defined in linux/device-mapper.h,
> which does not seem like to best place to have it. Why not blkdev.h ?
> Any particular reason ? This leads to some strange include dependencies,
> like many nfs/blocklayout/ files including device-mapper.h just to get
> that definition.

I'm fine with moving it and using it everywhere. Right now we don't in
the block core at all, 9 is always hard coded. While that change would
be fine, it should be done independently of this patch, which is a real
bug fix.

-- 
Jens Axboe




More information about the Linux-nvme mailing list