[PATCH] nvme: Add weighted-round-robin arbitration support

Kanchan Joshi joshi.k at samsung.com
Tue Jan 16 05:56:04 PST 2018


Thanks for the review, Sagi. Yes, it seems logical to split it the way you 
suggested. May I know whether you, Keith, or other maintainers would like 
to see any part(s) of this patch, in current or revised form, go into the 
current driver?
The crux of this patch was plumbing multiple SQs per nvmeq (an N:1 mapping). 
At this point I am not so sure, but if we are moving to the block layer 
(a multiple, differentiated hctx approach), the N:1 mapping may not be 
needed in the nvme driver; rather, there could be N hctxs with a 1:1 mapping.

On Sunday 14 January 2018 03:26 PM, Sagi Grimberg wrote:
> Hi Joshi,
> 
>> This patch enables support for Weighted-Round-Robin (WRR) arbitration, so
>> that applications can make use of the prioritization capabilities natively
>> present in the NVMe controller.
>>
>> - It links the existing ionice classes (real-time, best-effort, none, idle)
>> to NVMe priorities (urgent, high, medium, low).  This is done through the
>> 'request->ioprio' field inside the 'queue_rq' function.
>>
>> - The current driver has a 1:1 mapping (1 SQ, 1 CQ) per cpu, encapsulated
>> in the 'nvmeq' structure.  This patch refactors the code so that an N:1
>> mapping per cpu can be created; 'nvmeq' has been changed to contain a
>> variable number of SQ-related fields.  For WRR, 4 submission queues
>> (corresponding to the four queue priorities) need to be created on each cpu.
>>
>> - When the 'enable_wrr' module param is passed, it creates the 4:1 mapping
>> and enables the controller in WRR mode.  Otherwise, it continues to retain
>> the 1:1 mapping and the controller remains in RR mode.
>>
>> - An NVMe device may have fewer queues than required for a 4:1 mapping per
>> cpu.  For example, when num_possible_cpus is 64, 256 submission queues are
>> required for a 4:1 mapping, while the device may support, say, 128.  This
>> case is handled by creating 32 queue-pairs which are shared among the 64
>> cpus.  Another way to handle this could have been reducing to a 3:1 or 2:1
>> mapping (and remapping the 4 ionice classes as well).
>>
>> - The admin queue retains a 1:1 mapping irrespective of the mode (RR or
>> WRR) used.
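The queue-sharing fallback in the third bullet can be sketched as a standalone model (the struct and helper names here are made up for illustration, not driver code):

```c
/*
 * Sketch of the queue-sharing fallback described above: if the device
 * cannot supply 4 SQs per CPU, create as many 4-SQ sets as fit and
 * share each set among multiple CPUs.  Standalone model only.
 */
struct wrr_layout {
	unsigned int nr_sets;      /* 4-SQ sets (queue-pairs) actually created */
	unsigned int cpus_per_set; /* CPUs sharing each set */
};

static struct wrr_layout wrr_plan(unsigned int nr_cpus, unsigned int dev_sqs)
{
	struct wrr_layout l;
	unsigned int wanted = nr_cpus * 4;   /* 4:1 mapping needs 4 SQs per CPU */

	if (dev_sqs >= wanted) {
		l.nr_sets = nr_cpus;         /* full 4:1, one set per CPU */
		l.cpus_per_set = 1;
	} else {
		l.nr_sets = dev_sqs / 4;     /* e.g. 128 / 4 = 32 sets */
		/* round up so every CPU lands on some set: 64 / 32 = 2 */
		l.cpus_per_set = (nr_cpus + l.nr_sets - 1) / l.nr_sets;
	}
	return l;
}
```

With the numbers from the example above (64 cpus, 128 device SQs), this yields 32 sets with 2 cpus sharing each set.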
> 
> Regardless of the discussion with Keith, this patch should be divided into
> three or four preparatory patches plus the wrr patch.
> 
> 1. keeping nvmeq->cq_db
> 2. changing nvme_enable_ctrl not to set ctrl_config (needs to verify
>     doesn't break anything)
> 3. keeping multiple sqs per nvmeq and plumbing the sq_index
> 4. wire up wrr
> 
> This holds true even if this moves to blk-mq.



More information about the Linux-nvme mailing list