On 2025-01-28 10:31:47 [-1000], Tejun Heo wrote:
> Hello,

Hi,

> Mostly look great to me. Left mostly minor comments.
>
> On Tue, Jan 28, 2025 at 09:42:25AM +0100, Sebastian Andrzej Siewior wrote:
> > @@ -947,10 +947,20 @@ static int rdt_last_cmd_status_show(struct kernfs_open_file *of,
> >  	return 0;
> >  }
> >
> > +static void *rdt_get_kn_parent_priv(struct kernfs_node *kn)
> > +{
>
> nit: Rename rdt_kn_parent_priv() to be consistent with other accessors?

Oh, indeed.

> > diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> > index 5a1fea414996e..16d268345e3b7 100644
> > --- a/fs/kernfs/dir.c
> > +++ b/fs/kernfs/dir.c
> > @@ -64,9 +64,9 @@ static size_t kernfs_depth(struct kernfs_node *from, struct kernfs_node *to)
> >  {
> >  	size_t depth = 0;
> >
> > -	while (to->parent && to != from) {
> > +	while (rcu_dereference(to->__parent) && to != from) {
>
> Why not use kernfs_parent() here and other places?

Because it is from within RCU section and the other checks are not
required. If you prefer this instead, I sure can update it.

> > @@ -226,6 +227,7 @@ int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
> >  	unsigned long flags;
> >  	int ret;
> >
> > +	guard(rcu)();
>
> Doesn't irqsave imply rcu?

hmm. It kind of does based on the current implementation but it is not
obvious. We had RCU-sched and RCU which got merged. From then on, the
(implied) preempt-off part of IRQSAVE should imply RCU (section).
It is good to be obvious about RCU. Also, rcu_dereference() will
complain about missing RCU annotation. On PREEMPT_RT
rcu_dereference_sched() will complain because irqsave (in this case)
will not disable interrupts.

> > @@ -558,11 +567,7 @@ void kernfs_put(struct kernfs_node *kn)
> >  		return;
> >  	root = kernfs_root(kn);
> >   repeat:
> > -	/*
> > -	 * Moving/renaming is always done while holding reference.
> > -	 * kn->parent won't change beneath us.
> > -	 */
> > -	parent = kn->parent;
> > +	parent = kernfs_parent(kn);
>
> Not a strong opinion but I'd keep the comment. Reader can go read the
> definition of kernfs_parent() but no harm in explaining the subtlety where
> it's used.

Okay, will bring it back.

> > @@ -1376,7 +1388,7 @@ static void kernfs_activate_one(struct kernfs_node *kn)
> >  	if (kernfs_active(kn) || (kn->flags & (KERNFS_HIDDEN | KERNFS_REMOVING)))
> >  		return;
> >
> > -	WARN_ON_ONCE(kn->parent && RB_EMPTY_NODE(&kn->rb));
> > +	WARN_ON_ONCE(kernfs_parent(kn) && RB_EMPTY_NODE(&kn->rb));
>
> Minor but this one can be rcu_access_pointer() too.

Ok.

> > @@ -1794,7 +1813,7 @@ static struct kernfs_node *kernfs_dir_pos(const void *ns,
> >  {
> >  	if (pos) {
> >  		int valid = kernfs_active(pos) &&
> > -			pos->parent == parent && hash == pos->hash;
> > +			kernfs_parent(pos) == parent && hash == pos->hash;
>
> Ditto with rcu_access_pointer(). Using kernfs_parent() here is fine too but
> it's a bit messy to mix the two for similar cases. Let's stick to either
> rcu_access_pointer() or kernfs_parent().

I will make both (kernfs_activate_one() and kernfs_dir_pos()) use
rcu_access_pointer() then.

> > diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
> > index b42ee6547cdc1..c43bee18b79f7 100644
> > --- a/fs/kernfs/kernfs-internal.h
> > +++ b/fs/kernfs/kernfs-internal.h
> > @@ -64,11 +66,14 @@ struct kernfs_root {
> >   *
> >   * Return: the kernfs_root @kn belongs to.
> >   */
> > -static inline struct kernfs_root *kernfs_root(struct kernfs_node *kn)
> > +static inline struct kernfs_root *kernfs_root(const struct kernfs_node *kn)
> >  {
> > +	const struct kernfs_node *knp;
> >  	/* if parent exists, it's always a dir; otherwise, @sd is a dir */
> > -	if (kn->parent)
> > -		kn = kn->parent;
> > +	guard(rcu)();
> > +	knp = rcu_dereference(kn->__parent);
> > +	if (knp)
> > +		kn = knp;
> >  	return kn->dir.root;
> >  }
>
> This isn't a new problem but the addition of the rcu guard makes it stick
> out more: What keeps the returned root safe to dereference?

As far as I understand it, kernfs_root is around as long as the
filesystem itself is around, which means at least one node needs to
stay. If you have a pointer to a kernfs_node you should own a reference.
The RCU section is only needed to ensure that the (current) __parent is
not replaced and then deallocated before the caller had a chance to
obtain the root pointer.

> > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> > index d9061bd55436b..214aa378936cd 100644
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -633,9 +633,22 @@ int cgroup_task_count(const struct cgroup *cgrp)
> >  	return count;
> >  }
> >
> > +static struct cgroup *kn_get_priv(struct kernfs_node *kn)
> > +{
> > +	struct kernfs_node *parent;
> > +	/*
> > +	 * The parent can not be replaced due to KERNFS_ROOT_INVARIANT_PARENT.
> > +	 * Therefore it is always safe to dereference this pointer outside of a
> > +	 * RCU section.
> > +	 */
> > +	parent = rcu_dereference_check(kn->__parent,
> > +				       kernfs_root_flags(kn) & KERNFS_ROOT_INVARIANT_PARENT);
> > +	return parent->priv;
> > +}
>
> kn_priv()?

Oh, yes.

> Thanks.

Sebastian
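
To illustrate the kernfs_root() argument, here is a rough userspace
model, not kernel code. Every model_* name is made up for this sketch,
the lock/unlock helpers only count nesting instead of providing real
RCU, and a C11 acquire load stands in for rcu_dereference(). It shows
that the parent is only looked at inside the read-side section and that
just the root pointer leaves it, so a concurrent rename does not change
the result.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins: they only track nesting, they are NOT RCU. */
static int model_rcu_depth;
static void model_rcu_read_lock(void)   { model_rcu_depth++; }
static void model_rcu_read_unlock(void) { model_rcu_depth--; }

struct model_root { int id; };

struct model_node {
	_Atomic(struct model_node *) parent;	/* models kn->__parent */
	struct model_root *root;		/* models kn->dir.root, set on dirs */
};

/* Models kernfs_root(): look at the parent only inside the section. */
static struct model_root *model_node_root(struct model_node *kn)
{
	struct model_node *knp;
	struct model_root *root;

	model_rcu_read_lock();
	/* acquire load stands in for rcu_dereference() */
	knp = atomic_load_explicit(&kn->parent, memory_order_acquire);
	if (knp)
		kn = knp;
	root = kn->root;	/* copy the root out before leaving the section */
	model_rcu_read_unlock();
	return root;
}

/* Models a rename: publish a new parent for @kn. */
static void model_node_reparent(struct model_node *kn,
				struct model_node *new_parent)
{
	atomic_store_explicit(&kn->parent, new_parent, memory_order_release);
}
```

The returned root stays valid in this model for the same reason as
above: both directories point at the same root, and the root outlives
every node.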