[PATCH] nvmet-tcp: enable optional queue idle period tracking

Thu Feb 18 13:38:25 EST 2021

>> +/*
>> + * Define a time period (in msecs)

>Is this in msecs or usecs? Anyway I think it should be in the name as well.

Ah, yes, that should be usecs.  Will update the name (queue_idle_period_usecs)?

>> + * queue before determining it to be idle.  This optional module 
>> +behavior
>> + * can enable NIC solutions that support socket optimized packet 
>> +processing
>> + * using advanced interrupt moderation techniques.
>> + */
>> +static int queue_idle_period;
>> +module_param(queue_idle_period, int, 0644); 
>> +MODULE_PARM_DESC(queue_idle_period, "nvmet tcp io_work queue idle 
>> +time period");

>I think this should be called "queue_poll_period" maybe? because this is the period a queue will keep actively trying to read data from the socket regardless if there is any activity or not right?

>Or does this mean a different semantics with this?

The idea is to actively sample a queue until a cumulative period of time is experienced with no activity, the 'idle' period.  The starting time for the time period always reset if any activity during the current period is experienced.  The goal being to continually extend the sample period before being forced back into legacy interrupt mode.  So no, the time period was not intended to be a limit regardless of activity or not.

>> +
>>   #define NVMET_TCP_RECV_BUDGET		8
>>   #define NVMET_TCP_SEND_BUDGET		8
> >  #define NVMET_TCP_IO_WORK_BUDGET	64
>> @@ -96,6 +106,7 @@ struct nvmet_tcp_queue {
>>   	struct work_struct	io_work;
>>   	struct nvmet_cq		nvme_cq;
>>   	struct nvmet_sq		nvme_sq;
>> +	unsigned long           idle_period;

>poll_period?

So if you agree with my explanation above for the time period as the measurement of a 'idle' period of time with no activity, then I think keeping this as idle_period makes sense. Agree?

>> +	if (queue_idle_period) {
>> +		/* Assign the queue an idle period if not already set */
>> +		if (!queue->idle_period)
>> +			queue->idle_period =
>> +					usecs_to_jiffies(queue_idle_period);
>> +		idle_deadline = jiffies + queue->idle_period;

>Why do you need to keep a period time?

Because the io_work loop was written to exit on any iteration with no activity, assuming to provide for fairness with other queue workers that share the same core, the idle period of time must support many passes in and out of the work item for that queue by the kworker to preserve legacy behavior.  And only wanted to deduct the actual time while in the work item from the total time period, not any time that some other worker was taking on that shared core.

I suppose one could ignore this condition, but then the time allotted to a queue is less deterministic as the number of those sharing the core increases/changes.  During testing I didn't feel we could ignore this condition, what do you think?

>Why don't you just maintain queue->poll_start and just check if your past the queue_poll time addition:

>	if (queue_idle_period && queue->start_poll == 0)
>		queue->start_poll = jiffies;

>...
>[io_work loop]
>...

>	if (queue_poll_period) {
>		if (!time_after(jiffies, queue->start_poll +
>				usecs_to_jiffies(queue_poll_period)))
>			pending = true;
>		else
>			queue->start_poll = 0;
>	}

>Can this work?
If we were to not adjust or account for time while not in the work item, this logic might work.  And if we wanted to restart the idle period upon any activity, then in the 'if' case after setting pending=true would need to add check ' if (ops) then queue->start_poll = 0; '

>> +	}
>>   
>>   	do {
>>   		pending = false;
>> @@ -1222,8 +1263,30 @@ static void nvmet_tcp_io_work(struct 
>> work_struct *w)
>>   
>>   	} while (pending && ops < NVMET_TCP_IO_WORK_BUDGET);
>>   
>> +	/* If optional deadline mode active, determine if queue has reached its
>> +	 * idle process deadline limit.  Remaining deadline is calculated. Any ops
>> +	 * activity awards the queue a new deadline period.
>> +	 */

>Comments should stay consistent in style...

Got it.  assume the form above is OK?

>> +	if (queue_idle_period) {
>> +		/*
>> +		 * Clear to award active non-idle queue new period, or
>> +		 * reset for future queue activity after exit when idle reached.
>> +		 */
>> +		queue->idle_period = 0;
>> +		if (ops > 0) {
>> +			pending = true;

>What is this useful for?

This ensures that if there was any activity recorded during the total io_work loop, not just the last iteration, then the work item will be re-queued.  And because there was activity recorded then we need not process the current idle deadline because it will be renewed the next time into the work item with a new idle start period.  Maybe improved comment above this code required?

>> +		} else if (!time_after(jiffies, idle_deadline)) {
>> +			period_remaining = (long)(idle_deadline - jiffies);
>> +			if (period_remaining > 0) {
>> +				pending = true;
>> +				queue->idle_period = period_remaining;
>> +			}
>> +		}
>> +	}