[PATCH 17/17] nvme: enable non-inline passthru commands
Jens Axboe
axboe at kernel.dk
Thu Mar 31 20:05:12 PDT 2022
On 3/31/22 8:44 PM, Jens Axboe wrote:
> On 3/31/22 8:33 PM, Kanchan Joshi wrote:
>> On Fri, Apr 1, 2022 at 6:55 AM Jens Axboe <axboe at kernel.dk> wrote:
>>>
>>> On 3/30/22 7:14 AM, Kanchan Joshi wrote:
>>>> On Wed, Mar 30, 2022 at 6:32 PM Christoph Hellwig <hch at lst.de> wrote:
>>>>>
>>>>> On Fri, Mar 25, 2022 at 07:09:21PM +0530, Kanchan Joshi wrote:
>>>>>> Ok. If you are open to take new opcode/struct route, that is all we
>>>>>> require to pair with big-sqe and have this sorted. How about this -
>>>>>
>>>>> I would much, much, much prefer to support a bigger CQE. Having
>>>>> a pointer in there just creates a fair amount of overhead and
>>>>> really does not fit into the model nvme and io_uring use.
>>>>
>>>> Sure, will post the code with bigger-cqe first.
>>>
>>> I can add the support, should be pretty trivial. And do the liburing
>>> side as well, so we have a sane base.
>>
>> I will post the big-cqe based work today. It works with fio.
>> It does not deal with liburing (which seems tricky), but hopefully it
>> can help us move forward anyway.
>
> Let's compare then, since I just did the support too :-)
>
> Some limitations in what I pushed:
>
> 1) Doesn't support the inline completion path. Undecided if this is
> super important or not, the priority here for me was to not pollute the
> general completion path.
>
> 2) Doesn't support overflow. That can certainly be done, only
> complication here is that we need 2x64bit in the io_kiocb for that.
> Perhaps something can get reused for that, not impossible. But figured
> it wasn't important enough for a first run.
>
> I also did the liburing support, but haven't pushed it yet. That's
> another case where some care has to be taken to avoid making the general
> path slower.
>
> Oh, it's here, usual branch:
>
> https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-big-sqe
>
> and based on top of the pending 5.18 bits and the current 5.19 bits.

Do post your version too, would be interesting to compare. I just wired
mine up to NOP, hasn't seen any testing beyond just verifying that we do
pass back the extra data.

Added inline completion as well. Kind of interesting in that performance
actually seems to be _better_ with CQE32 for my initial testing, just
using NOP. More testing surely needed, will run it on actual hardware
too as I have a good idea what performance should look like there.

I also think it's currently broken for request deferral and timeouts,
but those are just minor tweaks that need to be made to account for the
cq head being doubly incremented on bigger CQEs.
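
To make the head accounting concrete, here is a minimal userspace sketch of
a completion ring where a big CQE occupies two 16-byte slots, so head and
tail move by two per completion. This is a toy model and not the code in the
branch: the names (union slot, struct toy_cq, toy_cq_post, toy_cq_reap) are
hypothetical, and the layout is an assumption loosely based on the regular
16-byte CQE. Anything that counts pending completions from (tail - head)
has to divide by the stride, which is the kind of adjustment deferral and
timeouts would need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte ring slot: either a regular CQE or two extra u64s. */
union slot {
	struct {
		uint64_t user_data;
		int32_t  res;
		uint32_t flags;
	} cqe;
	uint64_t extra[2];
};

/* Hypothetical toy ring; 'entries' counts 16-byte slots, power of two. */
struct toy_cq {
	union slot *slots;
	uint32_t entries;
	uint32_t head;	/* consumer index, free-running */
	uint32_t tail;	/* producer index, free-running */
	int big;	/* nonzero: each completion uses two slots */
};

/* Slots consumed per completion: 2 for big CQEs, 1 otherwise. */
static uint32_t toy_cq_stride(const struct toy_cq *cq)
{
	return cq->big ? 2 : 1;
}

/* Pending completions; a raw (tail - head) would overcount in big mode. */
static uint32_t toy_cq_ready(const struct toy_cq *cq)
{
	return (cq->tail - cq->head) / toy_cq_stride(cq);
}

/* Post one completion. Returns 0 on success, -1 if the ring is full. */
static int toy_cq_post(struct toy_cq *cq, uint64_t user_data, int32_t res,
		       uint64_t extra1, uint64_t extra2)
{
	uint32_t stride = toy_cq_stride(cq);
	union slot *s;

	assert((cq->entries & (cq->entries - 1)) == 0);
	if (cq->entries - (cq->tail - cq->head) < stride)
		return -1;

	s = &cq->slots[cq->tail & (cq->entries - 1)];
	s->cqe.user_data = user_data;
	s->cqe.res = res;
	s->cqe.flags = 0;
	if (cq->big) {
		/* The second slot carries the two extra 64-bit values. */
		s[1].extra[0] = extra1;
		s[1].extra[1] = extra2;
	}
	cq->tail += stride;	/* advances by two for a big CQE */
	return 0;
}

/* Reap one completion. Returns 0 on success, -1 if the ring is empty. */
static int toy_cq_reap(struct toy_cq *cq, uint64_t *user_data,
		       uint64_t extra[2])
{
	union slot *s;

	if (cq->head == cq->tail)
		return -1;

	s = &cq->slots[cq->head & (cq->entries - 1)];
	*user_data = s->cqe.user_data;
	if (cq->big) {
		extra[0] = s[1].extra[0];
		extra[1] = s[1].extra[1];
	}
	cq->head += toy_cq_stride(cq);	/* the "double increment" */
	return 0;
}
```

With a four-slot ring in big mode, two completions fill the ring and each
reap moves head by two, so toy_cq_ready() reports 1 where a raw
(tail - head) would report 2.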
--
Jens Axboe