QUEUE_FLAG_NO_SG_MERGE and non-block-mq
Jens Axboe
axboe at fb.com
Fri Nov 27 08:14:08 PST 2015
On 11/27/2015 07:29 AM, Hannes Reinecke wrote:
> On 11/26/2015 10:21 AM, Ming Lei wrote:
>> On Thu, Nov 26, 2015 at 4:13 PM, Hannes Reinecke <hare at suse.de> wrote:
>>> Hi all,
>>>
>>> while investigating the crash in scsi_lib.c I found a rather curious
>>> behaviour for QUEUE_FLAG_NO_SG_MERGE.
>>>
>>> While the flag is evaluated in blk_recalc_rq_segments and
>>> blk_recount_segments (resulting in nr_phys_segments being
>>> computed based on that flag) it is completely ignored
>>> during blk_rq_map_sg() or the actual merging itself.
>>
>> Yes, I guess Jens introduced the flag for decreasing CPU
>> consumption when comuputing segments, but it is still
>> ignored by blk_rq_map_sg(), but it may not be used
>> by some drivers.
>>
>> After bio splitting is introduced, the flag is also ignored
>> when computing segments.
>>
>>>
>>> This typically shouldn't be an issue, seeing that with
>>> QUEUE_FLAG_NO_SG_MERGE nr_phys_segments will always be
>>> larger than the actual segment count.
>>>
>>> However, it still makes me wonder:
>>> What is the point of having a QUEUE_FLAG_NO_SG_MERGE
>>> which doesn't work as advertised?
>>> Or, to be precise, which only works for blk-mq?
>>> Should we make it work for non-block-mq, too?
>>
>> Thanks bio splitting, this flag has little effect on performance now,
>> so I think it can be removed if Jens has no objection.
>>
> As per your suggestion we've made some performance measurements,
> and 4k fio showed little if no impact:
>
> NO_SG_MERGE:
> IOPS R/W: 148097.7+-125.7 / 148124.1+-123.1
> BW R/W: 592392.4+-502.7 / 592498.3+-492.3
> SG_MERGE:
> IOPS R/W: 148054.4+-123.3 / 148082.6+-120.0
> BW R/W: 592219.2+-493.5 / 592332.3+-479.7
>
> So the performance benefit lies squarely within the
> error margin, making me wonder if it's worth bothering
> with having the NO_SG_MERGE flag at all.
>
> Thanks to Johannes for doing the measurements :-)
150K iops is on the slow side, though. It's pointless to iterate the sg
list if we don't have to. I can try and run a few tests next week.
--
Jens Axboe
More information about the Linux-nvme
mailing list