[PATCH v7 06/10] io_uring/rw: add support to send metadata along with read/write
Christoph Hellwig
hch at lst.de
Tue Nov 5 08:00:51 PST 2024
On Tue, Nov 05, 2024 at 09:21:27PM +0530, Kanchan Joshi wrote:
> Can add the documentation (if this version is palatable for Jens/Pavel),
> but this was discussed in previous iteration:
>
> 1. Each meta type may have different space requirement in SQE.
>
> Only for PI, we need so much space that we can't fit that in first SQE.
> The SQE128 requirement is only for PI type.
> Another different meta type may just fit into the first SQE. For that we
> don't have to mandate SQE128.
Ok, I'm really confused now. The way I understood Anuj was that this
is NOT about block level metadata, but about other uses of the big SQE.
Which version is right? Or did I just completely misunderstand Anuj?
> 2. If two meta types are known not to co-exist, they can be kept in the
> same place within SQE. Since each meta-type is a flag, we can check what
> combinations are valid within io_uring and throw the error in case of
> incompatibility.
And this sounds like what you refer to is not actually block metadata
as in this patchset or nvme (or, weirdly enough, integrity in the block
layer code).
> 3. Previous version was relying on SQE128 flag. If user set the ring
> that way, it is assumed that PI information was sent.
> This is more explicitly conveyed now - if user passed META_TYPE_PI flag,
> it has sent the PI. This comment in the code:
>
> + /* if sqe->meta_type is META_TYPE_PI, last 32 bytes are for PI */
> + union {
>
> If this flag is not passed, parsing of second SQE is skipped, which is
> the current behavior as now also one can send regular (non pi)
> read/write on SQE128 ring.
And while I don't understand how this threads in with the previous
statements, this makes sense. If you only want to send a pointer (+len)
to metadata you can use the normal 64-byte SQE. If you want to send
a PI tuple you need SQE128. Is that what the various statements above
try to express? If so, the right API to me would be to have two flags:
- a flag that a pointer to metadata is passed. This can work with
a 64-byte SQE.
- another flag that a PI tuple is passed. This requires a 128-byte
SQE and also the previous flag.
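
To make the two-flag idea concrete, here is a rough userspace sketch of
the validity check it implies. The flag names, the helper, and the
`ring_is_sqe128` parameter are all made up for illustration; none of
them come from this patch set or from the existing io_uring UAPI.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names, not taken from the patch set or the UAPI. */
#define SKETCH_META_PTR	(1U << 0)	/* SQE carries a metadata pointer + length */
#define SKETCH_META_PI	(1U << 1)	/* SQE additionally carries a PI tuple */

/*
 * Check a metadata flag combination against the ring geometry.
 * Returns 0 if the combination is acceptable, -EINVAL otherwise.
 */
static int sketch_check_meta_flags(uint32_t flags, bool ring_is_sqe128)
{
	/* Reject flag bits this sketch does not know about. */
	if (flags & ~(SKETCH_META_PTR | SKETCH_META_PI))
		return -EINVAL;

	/* No metadata, or pointer only: a 64-byte SQE is sufficient. */
	if (!(flags & SKETCH_META_PI))
		return 0;

	/* A PI tuple only makes sense together with the metadata pointer. */
	if (!(flags & SKETCH_META_PTR))
		return -EINVAL;

	/* The PI tuple lives in the second half of a 128-byte SQE. */
	if (!ring_is_sqe128)
		return -EINVAL;

	return 0;
}
```

With this shape a pointer-only submission passes on any ring, while the
PI flag is rejected unless the pointer flag is also set and the ring was
created with 128-byte SQEs.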