[PATCH 17/17] nvme: enable non-inline passthru commands
Jens Axboe
axboe at kernel.dk
Thu Mar 31 18:22:33 PDT 2022
On 3/22/22 11:10 AM, Kanchan Joshi wrote:
>> We need to decouple the
>> uring cmd properly. And properly in this case means not to add a
>> result pointer, but to drop the result from the _input_ structure
>> entirely, and instead optionally support a larger CQ entry that contains
>> it, just like the first patch does for the SQ.
>
> Creating a large CQE was my thought too. Gave that another stab.
> Dealing with two types of CQE felt nasty to fit in liburing's api-set
> (which is cqe-heavy).
>
> Jens: Do you already have thoughts (go/no-go) for this route?

Yes, I think we should just add support for 32-byte CQEs as well. The only
thing I've pondered here is whether it makes sense to manage them
separately, or if you should just get both big sqe and cqe support in one
setting. For passthrough, you'd want both. But e.g. for zoned writes, you
can make do with normal sized sqes and only use larger cqes.

I did actually benchmark big sqes in peak testing, and found them to
perform about the same, no noticeable difference. Which does make sense,
as normal IO with a big sqe would only touch the normal sized part and
leave the extra half unwritten and unread. Since they are cacheline
sized, there's no extra load there.
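
To make the cacheline point concrete, here is a minimal sketch. It assumes
64-byte cachelines and a 64-byte normal sqe; the struct names and the
lines_touched() helper are hypothetical stand-ins for illustration, not
kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64u /* assumption: 64-byte cachelines */

/* Hypothetical stand-in for a normal sized (64-byte) sqe. */
struct demo_sqe {
    uint8_t bytes[64];
};

/* Hypothetical big sqe: a normal sqe followed by 64 bytes of extra space. */
struct demo_big_sqe {
    struct demo_sqe base;
    uint8_t cmd[64];
};

/*
 * Number of cachelines covered by the first `len` bytes of a
 * cacheline-aligned entry.
 */
static unsigned lines_touched(size_t len)
{
    return (unsigned)((len + CACHELINE - 1) / CACHELINE);
}
```

Under these assumptions, normal IO that only fills in the base part covers
one cacheline per entry, the same as with a normal sqe; only users of the
extra space cover the second one.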

For big cqes, that's a bit different, and I'd expect a bit of a
performance hit. We can currently fit 4 of them into a cacheline; with
the change it'd be 2. The same number of ops/sec would hence touch twice
as many cachelines for completions.
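
The arithmetic behind that can be sketched as follows, assuming 64-byte
cachelines. demo_cqe16 mirrors the current 16-byte cqe layout; demo_cqe32
and both helpers are hypothetical, only meant to illustrate a 32-byte
entry.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64u /* assumption: 64-byte cachelines */

/* 16-byte completion entry: user_data, res, flags. */
struct demo_cqe16 {
    uint64_t user_data;
    int32_t  res;
    uint32_t flags;
};

/* Hypothetical 32-byte completion entry with 16 bytes of extra result data. */
struct demo_cqe32 {
    uint64_t user_data;
    int32_t  res;
    uint32_t flags;
    uint64_t extra[2];
};

/* How many entries of the given size fit into one cacheline. */
static unsigned cqes_per_line(size_t cqe_size)
{
    return (unsigned)(CACHELINE_BYTES / cqe_size);
}

/* Cachelines needed to hold `completions` contiguous entries of the given size. */
static uint64_t lines_for(uint64_t completions, size_t cqe_size)
{
    uint64_t per_line = cqes_per_line(cqe_size);

    return (completions + per_line - 1) / per_line;
}
```

With these sizes, one million completions span 250000 cachelines with
16-byte entries and 500000 with 32-byte entries.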

But I still think doing it inline is way better than having to copy part
of the completion info back out-of-band, and it's more efficient for
that case too, for sure.

> From all that we discussed, maybe the path forward could be this:
> - inline-cmd/big-sqe is useful if paired with big-cqe. Drop big-sqe
> for now if we cannot go the big-cqe route.

We should go big cqe for sure; it'll help clean up a bunch of things
too.

--
Jens Axboe