On 9/12/2024 6:31 PM, Christoph Hellwig wrote: > On Tue, Sep 10, 2024 at 08:31:58PM +0530, Kanchan Joshi wrote: >> This is similar to existing F_{SET/GET}_RW_HINT but more >> generic/extensible. >> >> F_SET/GET_RW_HINT_EX take a pointer to a struct rw_hint_ex as argument: >> >> struct rw_hint_ex { >> __u8 type; >> __u8 pad[7]; >> __u64 val; >> }; >> >> With F_SET_RW_HINT_EX, the user passes the hint type and its value. >> Hint type can be either lifetime hint (TYPE_RW_LIFETIME_HINT) or >> placement hint (TYPE_RW_PLACEMENT_HINT). The interface allows to add >> more hint add more hint types in future. > > What is the point of multiplexing these into a single call vs having > one fcntl for each? It's not like the code points are a super > limited resource. Do you mean new fcntl code only for placement hint? I thought folks will prefer the user-interface to be future proof so that they don't have to add a new fcntl opcode. Had the existing fcntl accepted "hint type" as argument, I would not have resorted to add a new one now. You may have noticed that in io_uring metadata series also, even though current meta type is 'integrity', we allow user interface to express other types of metadata too. > And the _EX name isn't exactly descriptive either and screams of horrible > Windows APIs :) I can change to what you prefer. But my inspiration behind this name was Linux F_GET/SET_OWN_EX (which is revamped version of F_GET/SET_OWN). >> + WRITE_ONCE(inode->i_write_hint, hint); >> + if (file->f_mapping->host != inode) >> + WRITE_ONCE(file->f_mapping->host->i_write_hint, hint); > > This doesn't work. You need a file system method for this so that > the file system can intercept it, instead of storing it in completely > arbitrary inodes without any kind of checking for support or intercetion > point. > I don't understand why will it not work. The hint is being set in the same way how it is done in the current code (in existing fcntl handlers for temperature hints). >> --- a/include/linux/rw_hint.h >> +++ b/include/linux/rw_hint.h >> @@ -21,4 +21,17 @@ enum rw_lifetime_hint { >> static_assert(sizeof(enum rw_lifetime_hint) == 1); >> #endif >> >> +#define WRITE_HINT_TYPE_BIT BIT(7) >> +#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1) >> +#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \ >> + TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT) >> +#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK) >> + >> +#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \ >> + WRITE_HINT_VAL(h) : 0) >> +#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \ >> + 0 : WRITE_HINT_VAL(h)) >> + >> +#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT >> +#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1) > > That's a whole lot of undocumented macros. Please turn these into proper > inline functions and write documentation for them. I can try doing that.