[RFC PATCH] dm: fix excessive dm-mq context switching

Sagi Grimberg sagig at dev.mellanox.co.il
Sun Feb 7 08:54:54 PST 2016


>> If so, can you check with e.g.
>> perf record -ags -e LLC-load-misses sleep 10 && perf report whether this
>> workload perhaps triggers lock contention? What you need to look for in
>> the perf output is whether any functions occupy more than 10% CPU time.
>
> I will, thanks for the tip!

The perf report is very similar to the one that started this effort...

I'm afraid we'll need to address the contention on the per-target m->lock
in order to scale with NUMA...
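
For reference, both of the hot stacks in the report below funnel through
the same per-multipath spinlock. A rough sketch of the two paths (abridged,
not the literal dm-mpath code; the trimmed-down struct multipath and the
pg_has_ready_paths() helper are only illustrative):

#include <linux/spinlock.h>
#include <linux/device-mapper.h>

struct pgpath;
struct request;

/* Trimmed-down view of the multipath state; the real struct has much more. */
struct multipath {
        spinlock_t lock;                /* the single per-target lock */
        struct pgpath *current_pgpath;  /* cached path selection */
        /* priority groups, counters and feature flags omitted */
};

static bool pg_has_ready_paths(struct multipath *m);    /* illustrative helper */

/* map side: _raw_spin_lock_irq shows up under __multipath_map here */
static int __multipath_map(struct multipath *m, struct request *clone)
{
        struct pgpath *pgpath;

        spin_lock_irq(&m->lock);
        pgpath = m->current_pgpath;     /* path (re)selection runs under m->lock */
        spin_unlock_irq(&m->lock);

        if (!pgpath)
                return DM_MAPIO_REQUEUE;

        /* clone setup and I/O accounting omitted */
        return DM_MAPIO_REMAPPED;
}

/* busy poll: _raw_spin_lock_irqsave shows up under multipath_busy here */
static int multipath_busy(struct dm_target *ti)
{
        struct multipath *m = ti->private;
        unsigned long flags;
        bool busy;

        spin_lock_irqsave(&m->lock, flags);
        busy = !pg_has_ready_paths(m);
        spin_unlock_irqrestore(&m->lock, flags);

        return busy;
}

Every CPU submitting I/O to the same target serializes on that one lock,
which is consistent with queued_spin_lock_slowpath topping the profile on
a multi-socket box.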

-  17.33%              fio  [kernel.kallsyms]        [k] queued_spin_lock_slowpath
    - queued_spin_lock_slowpath
       - 52.09% _raw_spin_lock_irq
            __multipath_map
            multipath_clone_and_map
            map_request
            dm_mq_queue_rq
            __blk_mq_run_hw_queue
            blk_mq_run_hw_queue
            blk_mq_insert_requests
            blk_mq_flush_plug_list
            blk_flush_plug_list
            blk_finish_plug
            do_io_submit
            SyS_io_submit
            entry_SYSCALL_64_fastpath
          + io_submit
       - 46.87% _raw_spin_lock_irqsave
          - 99.97% multipath_busy
               dm_mq_queue_rq
               __blk_mq_run_hw_queue
               blk_mq_run_hw_queue
               blk_mq_insert_requests
               blk_mq_flush_plug_list
               blk_flush_plug_list
               blk_finish_plug
               do_io_submit
               SyS_io_submit
               entry_SYSCALL_64_fastpath
             + io_submit
+   4.99%              fio  [kernel.kallsyms]        [k] blk_account_io_start
+   3.93%              fio  [dm_multipath]           [k] __multipath_map
+   2.64%              fio  [dm_multipath]           [k] multipath_busy
+   2.38%              fio  [kernel.kallsyms]        [k] _raw_spin_lock_irqsave
+   2.31%              fio  [dm_mod]                 [k] dm_mq_queue_rq
+   2.25%              fio  [kernel.kallsyms]        [k] blk_mq_hctx_mark_pending
+   1.81%              fio  [kernel.kallsyms]        [k] blk_queue_enter
+   1.61%             perf  [kernel.kallsyms]        [k] copy_user_generic_string
+   1.40%              fio  [kernel.kallsyms]        [k] __blk_mq_run_hw_queue
+   1.26%              fio  [kernel.kallsyms]        [k] part_round_stats
+   1.14%              fio  [kernel.kallsyms]        [k] _raw_spin_lock_irq
+   0.96%              fio  [kernel.kallsyms]        [k] __bt_get
+   0.73%              fio  [kernel.kallsyms]        [k] enqueue_task_fair
+   0.71%              fio  [kernel.kallsyms]        [k] enqueue_entity
+   0.69%              fio  [dm_mod]                 [k] dm_start_request
+   0.60%      ksoftirqd/6  [kernel.kallsyms]        [k] blk_mq_run_hw_queues
+   0.59%     ksoftirqd/10  [kernel.kallsyms]        [k] blk_mq_run_hw_queues
+   0.59%              fio  [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
+   0.58%     ksoftirqd/19  [kernel.kallsyms]        [k] blk_mq_run_hw_queues
+   0.58%     ksoftirqd/18  [kernel.kallsyms]        [k] blk_mq_run_hw_queues
+   0.58%     ksoftirqd/23  [kernel.kallsyms]        [k] blk_mq_run_hw_queues