[PATCH v9 06/11] io_uring: introduce attributes for read/write and PI support
Anuj Gupta
anuj20.g at samsung.com
Thu Nov 21 00:59:35 PST 2024
On Mon, Nov 18, 2024 at 04:59:22PM +0000, Pavel Begunkov wrote:
> On 11/18/24 12:50, Christoph Hellwig wrote:
> > On Sat, Nov 16, 2024 at 12:32:25AM +0000, Pavel Begunkov wrote:
> > > We can also reuse your idea from your previous iterations and
> > > use the bitmap to list all attributes. Then preamble and
> > > the explicit attr_type field are not needed, type checking
> > > in the loop is removed and packing is better. And just
> > > by looking at the map we can calculate the size of the
> > > array and remove all size checks in the loop.
> >
> > Can we please stop overdesigning the f**k out of this? Really,
>
> Please stop it, it doesn't add weight to your argument. The design
> requirement has never changed, at least not during this patchset
> iterations.
>
> > either we're fine using the space in the extended SQE, or
> > we're fine using a separate opcode, or if we really have to just
> > make it uring_cmd. But stop making thing being extensible for
> > the sake of being extensible.
>
> It's asked to be extendible because there is a good chance it'll need to
> be extended, and no, I'm not suggesting anyone to implement the entire
> thing, only PI bits is fine.
>
> And no, it doesn't have to be "this or that" while there are other
> options suggested for consideration. And the problem with the SQE128
> option is not even about SQE128 but how it's placed inside, i.e.
> at a fixed spot.
>
> Do we have technical arguments against the direction in the last
> suggestion? It's extendible and _very_ simple. The entire (generic)
> handling for the bitmask approach for this set would be sth like:
>
> struct sqe {
> u64 attr_type_mask;
> u64 attr_ptr;
> };
> if (sqe->attr_type_mask) {
> if (sqe->attr_type_mask != TYPE_PI)
> return -EINVAL;
>
> struct uapi_pi_structure pi;
> copy_from_user(&pi, sqe->attr_ptr, sizeof(pi));
> hanlde_pi(&pi);
> }
>
> And the user side:
>
> struct uapi_pi_structure pi = { ... };
> sqe->attr_ptr = π
> sqe->attr_type_mask = TYPE_PI;
>
How about using this, but also have the ability to keep PI inline.
Attributes added down the line can take one of these options:
1. If available space in SQE/SQE128 is sufficient for keeping attribute
fields, one can choose to keep them inline and introduce a TYPE_XYZ_INLINE
attribute flag.
2. If the available space is not sufficient, pass the attribute information
as pointer and introduce a TYPE_XYZ attribute flag.
3. One can choose to support a attribute via both pointer and inline scheme.
The pointer scheme can help with scenarios where user wants to avoid SQE128
for whatever reasons (e.g. mixed workload).
Something like this:
// uapi/.../io_uring.h
struct sqe {
...
u64 attr_type_mask;
u64 attr_ptr;
};
// kernel space
if (sqe->attr_type_mask) {
struct uapi_pi_struct pi, *piptr;
if ((sqe->attr_type_mask & TYPE_PI_INLINE) &&
(sqe->attr_type_mask & TYPE_PI))
return -EINVAL;
/* inline PI case */
if (sqe->attr_type_mask & TYPE_PI_INLINE) {
if (!(ctx->flags & IORING_SETUP_SQE128))
return -EINVAL;
piptr = (sqe + 1);
} else if (sqe->attr_type_mask & TYPE_PI) {
/* indirect PI case */
copy_from_user(&pi, sqe->attr_ptr, sizeof(pi));
piptr = π
}
handle_pi(piptr);
}
And the user side:
// send PI using pointer:
struct uapi_pi_structure pi = { ... };
sqe->attr_ptr = π
sqe->attr_type_mask = TYPE_PI;
// send PI inline:
/* setup a big-sqe ring */
struct uapi_pi_structure *pi = (sqe+1);
prepare_pi(pi);
sqe->attr_type_mask = TYPE_PI_INLINE;
More information about the Linux-nvme
mailing list