[PATCH 2/2] nvme: fix the placement of set_current_state to TASK_INTERRUPTIBLE

Matthew Wilcox willy at linux.intel.com
Tue Feb 19 10:05:38 EST 2013


On Mon, Feb 04, 2013 at 02:45:41PM -0800, Arjan van de Ven wrote:
> the nvme driver has a kthread that processes certain events periodically.
> However, the kthread also gets woken for certain urgent actions.
> 
> The current code does not use the current_state logic correctly;
> it calls set_current_state(TASK_INTERRUPTIBLE); right before doing
> a schedule_timeout(), with the result that there is a race condition
> where a wakeup can get lost, and thus delayed by one second.
> 
> This patch moves the set_current_state(TASK_INTERRUPTIBLE); to before
> the place where the queue of outstanding work is checked, to close
> the race condition

The thing is, that's not a queue of outstanding work, that's the list
of devices that exist in the system.  Your patch makes the kthread do
all of its work in the TASK_INTERRUPTIBLE state, which I don't think is
right either (is it?)

We could add a flag to indicate that there's urgent work to be done
(checked after setting TASK_INTERRUPTIBLE), or just live with the
occasional race.

Alternatively, we could get rid of the kthread in favour of timers
for cancelling commands and a workqueue for dealing with urgent work.
I have this on my long-term todo list, but in the absence of having
hardware to test with on large-scale systems, I didn't want to make such
a large change.



More information about the Linux-nvme mailing list