[LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers
Bart Van Assche
Bart.VanAssche at sandisk.com
Thu Jan 12 11:13:17 PST 2017
On Thu, 2017-01-12 at 10:41 +0200, Sagi Grimberg wrote:
> First, when the nvme device fires an interrupt, the driver consumes
> the completion(s) in the interrupt handler (usually a few more
> completions are waiting in the cq by the time the host starts
> processing it). With irq-poll, we disable further interrupts and
> schedule a soft-irq for processing, which, if anything, improves the
> completions-per-interrupt utilization (because it takes slightly
> longer before the cq is processed).
>
> Moreover, irq-poll is budgeting the completion queue processing which is
> important for a couple of reasons.
>
> 1. It prevents the hard-irq context abuse we have today. If other cpu
> cores keep pounding the same queue with more submissions, we can
> end up in a hard lockup (which I've seen happen).
>
> 2. irq-poll maintains fairness between devices by correctly budgeting
> the processing of the different completion queues that share the same
> affinity. This can become crucial when working with multiple nvme
> devices, each with multiple io queues that share the same IRQ
> assignment.
>
> 3. It reduces (or at least should reduce) the overall number of
> interrupts in the system because we only enable interrupts again
> when the completion queue is completely processed.
>
> So overall, I think it's very useful for nvme and other modern HBAs,
> but unfortunately, other than solving (1), I wasn't able to see a
> performance improvement; instead I saw a slight regression, and I
> can't explain where it's coming from...
Hello Sagi,
Thank you for the additional clarification. Although I am not sure whether
irq-poll is the ideal solution for the problems that have been described
above, I agree that it would help to discuss this topic further during
LSF/MM.
Bart.
More information about the Linux-nvme mailing list