[PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write

Christoph Hellwig hch at lst.de
Tue Nov 5 21:29:28 PST 2024


On Tue, Nov 05, 2024 at 09:23:19AM -0700, Keith Busch wrote:
> > > The SQE128 requirement is only for PI type.
> > > Another different meta type may just fit into the first SQE. For that we 
> > > don't have to mandate SQE128.
> > 
> > Ok, I'm really confused now.  The way I understood Anuj was that this
> > is NOT about block level metadata, but about other uses of the big SQE.
> > 
> > Which version is right?  Or did I just completely misunderstand Anuj?
> 
> Let's not call this "meta_type". Can we use something that has a less
> overloaded meaning, like "sqe_extended_capabilities", or "ecap", or
> something like that?

So it's just a flag that a 128-byte SQE is used?  Don't we know that
implicitly from the sq?

> >  - a flag that a pointer to metadata is passed.  This can work with
> >    a 64-byte SQE.
> >  - another flag that a PI tuple is passed.  This requires a 128-byte
> >    SQE and also the previous flag.
> 
> I don't think anything done so far aligns with what Pavel had in mind.
> Let me try to lay out what I think he's going for. Just bear with me,
> this is just a hypothetical example.
> 
>   This patch adds a PI extension.
>   Later, let's say write streams need another extension.
>   Then key per-IO wants another extension.
>   Then someone else adds wizbang-awesome-feature extension.
> 
> Let's say you have a device that can do all 4, or any combination of them.
> Pavel wants a solution that is future proof to such a scenario. So not
> just a single new "meta_type" with its structure, but a list of types in
> no particular order, and their structures.

But why do we need the type at all?  Each of them obviously needs two
things:

 1) some space to actually store the extra fields
 2) a flag that the additional values are passed

Any single type value is not going to help with supporting arbitrary
combinations, because, well, you can mix and match, and you need
space for all of them even if you are not using all of them.
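
To illustrate the two points above, here is a minimal sketch in which
every optional feature gets its own flag bit and its own fixed space.
All names (struct ext_area, the EXT_HAS_* bits, ext_flags_valid()) are
hypothetical and chosen only for illustration; they are not part of the
posted series or of the io_uring uapi, and the field layout is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits: one per optional feature, freely combinable. */
#define EXT_HAS_PI		(1u << 0)	/* pi fields are valid */
#define EXT_HAS_WRITE_STREAM	(1u << 1)	/* write_stream is valid */
#define EXT_HAS_KEY_PER_IO	(1u << 2)	/* key_slot is valid */
#define EXT_KNOWN_FLAGS \
	(EXT_HAS_PI | EXT_HAS_WRITE_STREAM | EXT_HAS_KEY_PER_IO)

/*
 * Hypothetical layout for the second 64 bytes of a 128-byte SQE.
 * Space for every feature is always reserved, whether or not the
 * matching flag is set, so any combination fits without a type field.
 */
struct ext_area {
	uint32_t flags;			/* EXT_HAS_* bits */
	uint16_t write_stream;		/* used if EXT_HAS_WRITE_STREAM */
	uint16_t key_slot;		/* used if EXT_HAS_KEY_PER_IO */
	struct {			/* used if EXT_HAS_PI */
		uint64_t seed;
		uint16_t app_tag;
		uint16_t check_flags;
		uint32_t pad;
	} pi;
	uint8_t rsvd[40];		/* room for later features */
};

static_assert(sizeof(struct ext_area) == 64,
	      "extension area must fill the second half of a big SQE");

/*
 * Reject unknown bits, and reject any extension flag when the ring was
 * not set up with 128-byte SQEs (in this sketch all extension fields
 * live in the second half of the SQE).
 */
static bool ext_flags_valid(uint32_t flags, bool sqe128)
{
	if (flags & ~EXT_KNOWN_FLAGS)
		return false;
	if (flags && !sqe128)
		return false;
	return true;
}
```

With this shape, a submission using PI and key per-IO together just sets
EXT_HAS_PI | EXT_HAS_KEY_PER_IO and fills both sets of fields; no list
of typed structures is needed.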



