[PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
Bart Van Assche
bart.vanassche at sandisk.com
Wed Oct 5 14:08:57 PDT 2016
On 10/05/2016 12:11 PM, Sagi Grimberg wrote:
> I was referring to whether we can take srcu in the submission path
> conditional on the hctx being STOPPED?
Regarding run-time overhead:
* rcu_read_lock() is a no-op on CONFIG_PREEMPT_NONE kernels and is
translated into preempt_disable() on kernels with preemption enabled.
The latter function modifies a per-cpu variable.
* Checking BLK_MQ_S_STOPPED before taking an rcu or srcu lock is only
safe if the BLK_MQ_S_STOPPED flag is tested in such a way that the
compiler is told to reread the hctx flags (READ_ONCE()) and if the
compiler and CPU are told not to reorder test_bit() with the
memory accesses in (s)rcu_read_lock(). To avoid races
BLK_MQ_S_STOPPED will have to be tested a second time after the lock
has been obtained, similar to the double-checked-locking pattern.
* srcu_read_lock() reads a word from the srcu structure, disables
preemption, calls __srcu_read_lock() and re-enables preemption. The
latter function increments two CPU-local variables and triggers a
memory barrier (smp_mb()).
Swapping srcu_read_lock() and the BLK_MQ_S_STOPPED flag test will make
the code more complicated. Going back to the implementation that calls
rcu_read_lock() if .queue_rq() won't sleep will result in an
implementation that is easier to read and to verify. If I overlooked
something, please let me know.