[PATCH v9 06/11] io_uring: introduce attributes for read/write and PI support
Pavel Begunkov
asml.silence at gmail.com
Thu Nov 21 07:45:37 PST 2024
On 11/21/24 08:59, Anuj Gupta wrote:
> On Mon, Nov 18, 2024 at 04:59:22PM +0000, Pavel Begunkov wrote:
>> On 11/18/24 12:50, Christoph Hellwig wrote:
>>> On Sat, Nov 16, 2024 at 12:32:25AM +0000, Pavel Begunkov wrote:
...
>> Do we have technical arguments against the direction in the last
>> suggestion? It's extendible and _very_ simple. The entire (generic)
>> handling for the bitmask approach for this set would be sth like:
>>
>> struct sqe {
>> 	u64 attr_type_mask;
>> 	u64 attr_ptr;
>> };
>>
>> if (sqe->attr_type_mask) {
>> 	struct uapi_pi_structure pi;
>>
>> 	if (sqe->attr_type_mask != TYPE_PI)
>> 		return -EINVAL;
>>
>> 	if (copy_from_user(&pi, u64_to_user_ptr(sqe->attr_ptr), sizeof(pi)))
>> 		return -EFAULT;
>> 	handle_pi(&pi);
>> }
>>
>> And the user side:
>>
>> struct uapi_pi_structure pi = { ... };
>> sqe->attr_ptr = &pi;
>> sqe->attr_type_mask = TYPE_PI;
>>
>
> How about using this, but also have the ability to keep PI inline.
> Attributes added down the line can take one of these options:
> 1. If available space in SQE/SQE128 is sufficient for keeping attribute
> fields, one can choose to keep them inline and introduce a TYPE_XYZ_INLINE
> attribute flag.
> 2. If the available space is not sufficient, pass the attribute information
> as pointer and introduce a TYPE_XYZ attribute flag.
> 3. One can choose to support an attribute via both the pointer and inline schemes.
> The pointer scheme can help with scenarios where user wants to avoid SQE128
> for whatever reasons (e.g. mixed workload).

Right, the idea would work. The INLINE bit shouldn't be type specific,
though, but rather a single flag covering all attributes of a request.
IOW, either all of them are in user memory or all of them are optimised
(inline). We probably don't have a good place for such a flag, but then
you can just chip away a bit from attr_type_mask, as you're doing for
INLINE.

enum {
	TYPE_PI = 1,
	...
	TYPE_FLAG_INLINE = 1ULL << 63,
};

// sqe->attr_type_mask = TYPE_PI | TYPE_FLAG_INLINE;
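
To make the "one flag for all attributes" idea concrete, here is a rough
userspace model of the decode step. Everything in it is made up for
illustration and is not the proposed uapi: the ATTR_* names, struct pi_attr
and its fields, struct fake_sqe and its 64-byte inline_area (standing in for
the extra half of an SQE128), and parse_attrs() itself. Rejecting a mask
that has only the INLINE bit set is also just an assumption. memcpy() stands
in for copy_from_user() in the pointer case.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only; not the actual uapi. */
#define ATTR_TYPE_PI		(1ULL << 0)
#define ATTR_FLAG_INLINE	(1ULL << 63)
#define ATTR_TYPES_KNOWN	ATTR_TYPE_PI

struct pi_attr {		/* stand-in for uapi_pi_structure */
	uint16_t flags;
	uint16_t app_tag;
	uint32_t len;
	uint64_t addr;
	uint64_t seed;
};

struct fake_sqe {		/* models only the fields discussed here */
	uint64_t attr_type_mask;
	uint64_t attr_ptr;		/* used when INLINE is clear */
	uint8_t inline_area[64];	/* used when INLINE is set */
};

/*
 * Returns 0 on success and sets *has_pi if a PI attribute was loaded
 * into *pi. Returns -EINVAL for unknown type bits or an INLINE flag
 * without any type, and -EFAULT for a NULL attribute pointer.
 */
static int parse_attrs(const struct fake_sqe *sqe, struct pi_attr *pi,
		       int *has_pi)
{
	/* The top bit is a request-wide flag, not an attribute type. */
	uint64_t types = sqe->attr_type_mask & ~ATTR_FLAG_INLINE;
	int is_inline = (sqe->attr_type_mask & ATTR_FLAG_INLINE) != 0;
	const void *src;

	*has_pi = 0;
	if (!sqe->attr_type_mask)
		return 0;		/* no attributes at all */
	if (!types || (types & ~ATTR_TYPES_KNOWN))
		return -EINVAL;

	/* One flag decides where all attributes of the request live. */
	src = is_inline ? (const void *)sqe->inline_area
			: (const void *)(uintptr_t)sqe->attr_ptr;
	if (!src)
		return -EFAULT;

	/* The kernel would use copy_from_user() for the pointer case. */
	memcpy(pi, src, sizeof(*pi));
	*has_pi = 1;
	return 0;
}
```

With more than one type, the same flag would select the source for the
whole packed sequence of attributes, so the per-type handling stays
identical for both layouts.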

Another question is whether it's better to use the SQE or another mapping,
like the reg-wait thing does. My suggestion is to send it without the INLINE
optimisation, targeting 6.14 (I assume block bits are sorted?). We'll
figure out that optimisation separately and target the same release; there
is plenty of time for that.

--
Pavel Begunkov