[PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
Kanchan Joshi
joshi.k at samsung.com
Thu Sep 12 08:53:35 PDT 2024
On 9/12/2024 6:31 PM, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 08:31:58PM +0530, Kanchan Joshi wrote:
>> This is similar to existing F_{SET/GET}_RW_HINT but more
>> generic/extensible.
>>
>> F_SET/GET_RW_HINT_EX take a pointer to a struct rw_hint_ex as argument:
>>
>> struct rw_hint_ex {
>> __u8 type;
>> __u8 pad[7];
>> __u64 val;
>> };
>>
>> With F_SET_RW_HINT_EX, the user passes the hint type and its value.
>> Hint type can be either lifetime hint (TYPE_RW_LIFETIME_HINT) or
>> placement hint (TYPE_RW_PLACEMENT_HINT). The interface allows to add
>> more hint add more hint types in future.
>
> What is the point of multiplexing these into a single call vs having
> one fcntl for each? It's not like the code points are a super
> limited resource.
Do you mean new fcntl code only for placement hint?
I thought folks will prefer the user-interface to be future proof so
that they don't have to add a new fcntl opcode.
Had the existing fcntl accepted "hint type" as argument, I would not
have resorted to add a new one now.
You may have noticed that in io_uring metadata series also, even though
current meta type is 'integrity', we allow user interface to express
other types of metadata too.
> And the _EX name isn't exactly descriptive either and screams of horrible
> Windows APIs :)
I can change to what you prefer.
But my inspiration behind this name was Linux F_GET/SET_OWN_EX (which is
revamped version of F_GET/SET_OWN).
>> + WRITE_ONCE(inode->i_write_hint, hint);
>> + if (file->f_mapping->host != inode)
>> + WRITE_ONCE(file->f_mapping->host->i_write_hint, hint);
>
> This doesn't work. You need a file system method for this so that
> the file system can intercept it, instead of storing it in completely
> arbitrary inodes without any kind of checking for support or intercetion
> point.
>
I don't understand why will it not work. The hint is being set in the
same way how it is done in the current code (in existing fcntl handlers
for temperature hints).
>> --- a/include/linux/rw_hint.h
>> +++ b/include/linux/rw_hint.h
>> @@ -21,4 +21,17 @@ enum rw_lifetime_hint {
>> static_assert(sizeof(enum rw_lifetime_hint) == 1);
>> #endif
>>
>> +#define WRITE_HINT_TYPE_BIT BIT(7)
>> +#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1)
>> +#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT)
>> +#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK)
>> +
>> +#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + WRITE_HINT_VAL(h) : 0)
>> +#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + 0 : WRITE_HINT_VAL(h))
>> +
>> +#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT
>> +#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1)
>
> That's a whole lot of undocumented macros. Please turn these into proper
> inline functions and write documentation for them.
I can try doing that.
More information about the Linux-nvme
mailing list