On Tue, Dec 19, 2023 at 09:15:52AM +0200, Amir Goldstein wrote:
> On Mon, Dec 18, 2023 at 11:57 PM Vinicius Costa Gomes
> <vinicius.gomes@xxxxxxxxx> wrote:
> >
> > Christian Brauner <brauner@xxxxxxxxxx> writes:
> >
> > >> > Yes, the important thing is that an object cannot change
> > >> > its non_refcount property during its lifetime -
> > >>
> > >> ... which means that put_creds_ref() should assert that
> > >> there is only a single refcount - the one handed out by
> > >> prepare_creds_ref() before removing non_refcount or
> > >> directly freeing the cred object.
> > >>
> > >> I must say that the semantics of making a non-refcounted copy
> > >> to an object whose lifetime is managed by the caller sounds a lot
> > >> less confusing to me.
> > >
> > > So can't we do an override_creds() variant that is effectively just:
>
> Yes, I think that we can....
>
> > >
> > > /* caller guarantees lifetime of @new */
> > > const struct cred *foo_override_cred(const struct cred *new)
> > > {
> > >         const struct cred *old = current->cred;
> > >         rcu_assign_pointer(current->cred, new);
> > >         return old;
> > > }
> > >
> > > /* caller guarantees lifetime of @old */
> > > void foo_revert_creds(const struct cred *old)
> > > {
> > >         const struct cred *override = current->cred;
> > >         rcu_assign_pointer(current->cred, old);
> > > }
> > >
>
> Even better(?), we can do this in the actual guard helpers to
> discourage use without a guard:
>
> struct override_cred {
>         struct cred *cred;
> };
>
> DEFINE_GUARD(override_cred, struct override_cred *,
>             override_cred_save(_T),
>             override_cred_restore(_T));
>
> ...
>
> void override_cred_save(struct override_cred *new)
> {
>         new->cred = rcu_replace_pointer(current->cred, new->cred, true);
> }
>
> void override_cred_restore(struct override_cred *old)
> {
>         rcu_assign_pointer(current->cred, old->cred);
> }

The main thing we want is that it's somewhat clear that it's a
special-purpose interface (Sometimes I jokingly feel we should have
include/linux/quirky_overlayfs_helpers.h or actually working module
specific exports so we can export a helper to only a single module.
Whatever happened to that?).

If you do the cred guard thing then maybe name it:

{override,revert}_creds_light()

and then use them to implement the replace portion for:

{override,revert}_creds().

Yes, the {override,revert}_creds() naming isn't optimal but unless we
rename them as well to *_{save,restore} I don't see the point in making
the new helpers deviate from that pattern. They basically do the same
thing. So my point is to just let them mirror the naming in
__fget_light(). To a regular VFS developer the *_light() will give away
that it probably doesn't take a reference. But I'm not married to that.

So I'd probably just do something like the following COMPLETELY UNTESTED
AND UNCOMPILED thing:

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 2976f534a7a3..c975eb47e691 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -165,6 +165,24 @@ extern int cred_fscmp(const struct cred *, const struct cred *);
 extern void __init cred_init(void);
 extern int set_cred_ucounts(struct cred *);
 
+/*
+ * Override creds without bumping reference count. Caller must ensure
+ * reference remains valid or has taken reference. Almost always not the
+ * interface you want. Use override_creds()/revert_creds() instead.
+ */
+#define override_creds_light(override_cred)				\
+	({								\
+		const struct cred *__old_cred = current->cred;		\
+		rcu_assign_pointer(current->cred, override_cred);	\
+		__old_cred;						\
+	})
+
+#define revert_creds_light(revert_cred)					\
+	rcu_assign_pointer(current->cred, revert_cred);
+
+DEFINE_GUARD(cred, struct cred *, override_creds_light(_T),
+	     revert_creds_light(_T));
+
 static inline bool cap_ambient_invariant_ok(const struct cred *cred)
 {
 	return cap_issubset(cred->cap_ambient,
diff --git a/kernel/cred.c b/kernel/cred.c
index c033a201c808..d6713edaee37 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -485,7 +485,7 @@ EXPORT_SYMBOL(abort_creds);
  */
 const struct cred *override_creds(const struct cred *new)
 {
-	const struct cred *old = current->cred;
+	const struct cred *old;
 
 	kdebug("override_creds(%p{%ld})", new,
 	       atomic_long_read(&new->usage));
@@ -499,8 +499,7 @@ const struct cred *override_creds(const struct cred *new)
 	 * visible to other threads under RCU.
 	 */
 	get_new_cred((struct cred *)new);
-	rcu_assign_pointer(current->cred, new);
-
+	old = override_creds_light(new);
 	kdebug("override_creds() = %p{%ld}", old,
 	       atomic_long_read(&old->usage));
 	return old;
@@ -521,7 +520,7 @@ void revert_creds(const struct cred *old)
 	kdebug("revert_creds(%p{%ld})", old,
 	       atomic_long_read(&old->usage));
 
-	rcu_assign_pointer(current->cred, old);
+	revert_creds_light(old);
 	put_cred(override);
 }
 EXPORT_SYMBOL(revert_creds);

>
> > > Maybe I really fail to understand this problem or the proposed solution:
> > > the single reference that overlayfs keeps in ovl->creator_cred is tied
> > > to the lifetime of the overlayfs superblock, no? And anyone who needs a
> > > long term cred reference e.g, file->f_cred will take its own reference
> > > anyway. So it should be safe to just keep that reference alive until
> > > overlayfs is unmounted, no? I'm sure it's something quite obvious why
> > > that doesn't work but I'm just not seeing it currently.
> >
> > My read of the code says that what you are proposing should work. (what
> > I am seeing is that in the "optimized" cases, the only practical effect
> > of override/revert is the rcu_assign_pointer() dance)
> >
> > I guess that the question becomes: Do we want this property (that the
> > 'cred' associated with a superblock/similar is long lived and the
> > "inner" refcount can be omitted) to be encoded in the constructor? Or do
> > we want it to be "encoded" on a call by call basis?
> >
>
> Neither.
>
> Christian's proposal does not involve marking the cred object as
> long lived, which looks a much better idea to me.
>
> The performance issues you observed are (probably) due to get/put
> of cred refcount in the helpers {override,revert}_creds().

Most likely they are. I don't see what else would be expensive. But I
may lack details.

>
> Christian suggested lightweight variants of {override,revert}_creds()
> that do not change refcount. Combining those with a guard and
> I don't see what can go wrong (TM).

Place a nice comment explaining lifetime expectations in the commit
message. Then someone can always tell us why we're wrong.
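
To make the lifetime expectation concrete, here is a rough userspace
sketch of the idea. It is not kernel code. It assumes a single "task",
a plain long instead of an atomic refcount, and no RCU. Every name in it
(mock_*, CRED_GUARD, current_cred, the two *_usage() helpers) is made up
for illustration. The guard relies on the GCC/Clang cleanup attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct cred. */
struct cred {
	long usage;	/* reference count (not atomic in this sketch) */
	int fsuid;
};

/* Stand-in for current->cred; one "task" is enough here. */
static const struct cred *current_cred;

/* Refcounted pair, modelled on override_creds()/revert_creds(). */
static const struct cred *mock_override_creds(struct cred *new)
{
	const struct cred *old = current_cred;

	new->usage++;			/* plays the role of get_new_cred() */
	current_cred = new;
	return old;
}

static void mock_revert_creds(const struct cred *old)
{
	struct cred *override = (struct cred *)current_cred;

	current_cred = old;
	override->usage--;		/* plays the role of put_cred() */
}

/* Light pair: pointer swap only. Caller guarantees @new stays alive. */
static const struct cred *mock_override_creds_light(const struct cred *new)
{
	const struct cred *old = current_cred;

	current_cred = new;
	return old;
}

static void mock_revert_creds_light(const struct cred *old)
{
	current_cred = old;
}

/* Cleanup handler: restores the saved creds when the scope ends. */
static void cred_guard_exit(const struct cred **old)
{
	mock_revert_creds_light(*old);
}

#define CRED_GUARD(new)							\
	const struct cred *saved_cred_					\
		__attribute__((cleanup(cred_guard_exit), unused)) =	\
		mock_override_creds_light(new)

/* Usage count of @creator as seen while the light override is active. */
static long light_override_usage(struct cred *task, struct cred *creator)
{
	long seen;

	current_cred = task;
	{
		CRED_GUARD(creator);
		assert(current_cred == creator);
		seen = creator->usage;
	}
	/* The guard has run: the task's own creds are back. */
	assert(current_cred == task);
	return seen;
}

/* Same measurement for the refcounted pair, for contrast. */
static long refcounted_override_usage(struct cred *task, struct cred *creator)
{
	const struct cred *old;
	long seen;

	current_cred = task;
	old = mock_override_creds(creator);
	seen = creator->usage;
	mock_revert_creds(old);
	return seen;
}
```

In this sketch the light variant never touches creator->usage, so the
only thing keeping the object alive is whoever owns the long-lived
reference (for overlayfs that would be the superblock holding
ovl->creator_cred). That is exactly the expectation the comment should
spell out.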