[PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
Christoph Hellwig
hch at lst.de
Thu Sep 12 06:01:46 PDT 2024
On Tue, Sep 10, 2024 at 08:31:58PM +0530, Kanchan Joshi wrote:
> This is similar to existing F_{SET/GET}_RW_HINT but more
> generic/extensible.
>
> F_SET/GET_RW_HINT_EX take a pointer to a struct rw_hint_ex as argument:
>
> struct rw_hint_ex {
> __u8 type;
> __u8 pad[7];
> __u64 val;
> };
>
> With F_SET_RW_HINT_EX, the user passes the hint type and its value.
> Hint type can be either lifetime hint (TYPE_RW_LIFETIME_HINT) or
> placement hint (TYPE_RW_PLACEMENT_HINT). The interface allows to add
> more hint add more hint types in future.
What is the point of multiplexing these into a single call vs having
one fcntl for each? It's not like the code points are a super
limited resource.
And the _EX name isn't exactly descriptive either and screams of horrible
Windows APIs :)
> + WRITE_ONCE(inode->i_write_hint, hint);
> + if (file->f_mapping->host != inode)
> + WRITE_ONCE(file->f_mapping->host->i_write_hint, hint);
This doesn't work. You need a file system method for this so that
the file system can intercept it, instead of storing it in completely
arbitrary inodes without any kind of checking for support or intercetion
point.
> --- a/include/linux/rw_hint.h
> +++ b/include/linux/rw_hint.h
> @@ -21,4 +21,17 @@ enum rw_lifetime_hint {
> static_assert(sizeof(enum rw_lifetime_hint) == 1);
> #endif
>
> +#define WRITE_HINT_TYPE_BIT BIT(7)
> +#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1)
> +#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT)
> +#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK)
> +
> +#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + WRITE_HINT_VAL(h) : 0)
> +#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + 0 : WRITE_HINT_VAL(h))
> +
> +#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT
> +#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1)
That's a whole lot of undocumented macros. Please turn these into proper
inline functions and write documentation for them.
More information about the Linux-nvme
mailing list