[PATCH v7 3/3] io_uring: enable per-io hinting capability
Kanchan Joshi
joshi.k at samsung.com
Thu Oct 17 07:58:03 PDT 2024
On 10/2/2024 7:56 PM, Pavel Begunkov wrote:
> On 9/30/24 19:13, Kanchan Joshi wrote:
>> With F_SET_RW_HINT fcntl, user can set a hint on the file inode, and
>> all the subsequent writes on the file pass that hint value down.
>> This can be limiting for large files (and for block device) as all the
>> writes can be tagged with only one lifetime hint value.
>> Concurrent writes (with different hint values) are hard to manage.
>> Per-IO hinting solves that problem.
>>
>> Allow userspace to pass additional metadata in the SQE.
>> The type of passed metadata is expressed by a new field
>>
>> __u16 meta_type;
>
> The new layout looks nicer, but let me elaborate on the previous
> comment. I don't believe we should be restricting to only one
> attribute per IO. What if someone wants to pass a lifetime hint
> together with integrity information?
That is exactly why I made meta_type a bitmask, with one bit per meta type.
META_TYPE_LIFETIME_HINT and a new META_TYPE_INTEGRITY can coexist.
Since the field is a __u16, up to 16 meta types can coexist.
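To illustrate the bitmask idea, here is a minimal userspace sketch. The bit
positions, the META_TYPE_KNOWN mask and the helper names are assumptions made
for this example only; the values actually used are whatever the patch defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real values come from the patch. */
#define META_TYPE_LIFETIME_HINT (1U << 0)
#define META_TYPE_INTEGRITY     (1U << 1)

/* Mask of the meta types this sketch knows about (an assumption). */
#define META_TYPE_KNOWN (META_TYPE_LIFETIME_HINT | META_TYPE_INTEGRITY)

/* True if exactly one meta type bit is requested and it is set in the mask. */
static bool meta_type_has(uint16_t meta_type, uint16_t bit)
{
	/* The caller must pass a single bit, not a combination. */
	assert(bit != 0 && (bit & (bit - 1)) == 0);
	return (meta_type & bit) != 0;
}

/* True if the mask carries no bits outside the known set. */
static bool meta_type_valid(uint16_t meta_type)
{
	return (meta_type & ~META_TYPE_KNOWN) == 0;
}
```

With this layout, a submission carrying both a lifetime hint and integrity
information would set meta_type to
META_TYPE_LIFETIME_HINT | META_TYPE_INTEGRITY, and any bit outside the known
set can be rejected up front.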
> Instead, we might need something more extensible like an ability
> to pass a list / array of typed attributes / meta information / hints
> etc. An example from networking I gave last time was control messages,
> i.e. cmsg. In a basic oversimplified form the API from the user
> perspective could look like:
>
> struct meta_attr {
>     u16 type;
>     u64 data;
> };
>
> struct meta_attr attr[] = {{HINT, hint_value}, {INTEGRITY, ptr}};
> sqe->meta_attrs = attr;
> sqe->meta_nr = 2;
I would rather not add a pointer (and pay the copy_from_user cost) for
integrity. Currently integrity uses space in the second SQE, which seems
fine [*].
Down the line, if the number of meta types grows and we are close to running
out of SQE space, we can resort to adding an indirect reference.
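For comparison, the indirect (array of typed attributes) approach could look
roughly like the userspace sketch below. It is only an illustration: the
attribute type values, struct parsed_meta and parse_meta_attrs() are
hypothetical, and memcpy() stands in for the copy_from_user() that the kernel
would have to perform per submission with such a layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute types, for illustration only. */
enum { ATTR_HINT = 1, ATTR_INTEGRITY = 2 };

struct meta_attr {
	uint16_t type;
	uint64_t data;
};

struct parsed_meta {
	uint64_t hint;
	uint64_t integrity_ptr;
	uint16_t seen;	/* bitmask of attribute types found so far */
};

/*
 * Walk a caller-supplied attribute array and collect the values.
 * Returns 0 on success, -1 on an unknown or duplicate attribute type.
 */
static int parse_meta_attrs(const struct meta_attr *uattrs, unsigned int nr,
			    struct parsed_meta *out)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));

	for (unsigned int i = 0; i < nr; i++) {
		struct meta_attr a;
		uint16_t bit;

		/* Stand-in for copy_from_user(): one copy per attribute. */
		memcpy(&a, &uattrs[i], sizeof(a));

		if (a.type != ATTR_HINT && a.type != ATTR_INTEGRITY)
			return -1;

		bit = (uint16_t)(1U << (a.type - 1));
		if (out->seen & bit)
			return -1;	/* the same attribute was passed twice */
		out->seen |= bit;

		if (a.type == ATTR_HINT)
			out->hint = a.data;
		else
			out->integrity_ptr = a.data;
	}
	return 0;
}
```

The extra copy and the per-attribute validation loop are the costs I would
like to avoid for now, while the fields still fit in the SQE.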
[*]
https://lore.kernel.org/linux-nvme/20241016112912.63542-8-anuj20.g@samsung.com/