Quoting Matt Roper (2018-03-21 23:23:37) > There are cases where other parts of the kernel may wish to store data > associated with individual cgroups without building a full cgroup > controller. Let's add interfaces to allow them to register and lookup > this private data for individual cgroups. > > A kernel system (e.g., a driver) that wishes to register private data > for a cgroup should start by obtaining a unique private data key by > calling cgroup_priv_getkey(). It may then associate private data > with a cgroup by calling cgroup_priv_install(cgrp, key, ref) where 'ref' > is a pointer to a kref field inside the private data structure. The > private data may later be looked up by calling cgroup_priv_get(cgrp, > key) to obtain a new reference to the private data. Private data may be > unregistered via cgroup_priv_release(cgrp, key). > > If a cgroup is removed, the reference count for all private data objects > will be decremented. > > v2: Significant overhaul suggested by Tejun, Alexei, and Roman > - Rework interface to make consumers obtain an ida-based key rather > than supplying their own arbitrary void* > - Internal implementation now uses per-cgroup radixtrees which should > allow much faster lookup than the previous hashtable approach > - Private data is registered via kref, allowing a single private data > structure to potentially be assigned to multiple cgroups. > > v3: > - Spare warning fixes (kbuild test robot) > > Cc: Tejun Heo <tj@xxxxxxxxxx> > Cc: Alexei Starovoitov <ast@xxxxxx> > Cc: Roman Gushchin <guro@xxxxxx> > Cc: cgroups@xxxxxxxxxxxxxxx > Signed-off-by: Matt Roper <matthew.d.roper@xxxxxxxxx> > > fixup! cgroup: Allow registration and lookup of cgroup private data (v2) > --- > include/linux/cgroup-defs.h | 10 +++ > include/linux/cgroup.h | 7 ++ > kernel/cgroup/cgroup.c | 183 +++++++++++++++++++++++++++++++++++++++++++- > 3 files changed, 198 insertions(+), 2 deletions(-) > > diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h > index 9f242b876fde..9086d963cc0a 100644 > --- a/include/linux/cgroup-defs.h > +++ b/include/linux/cgroup-defs.h > @@ -427,6 +427,16 @@ struct cgroup { > /* used to store eBPF programs */ > struct cgroup_bpf bpf; > > + /* > + * cgroup private data registered by other non-controller parts of the > + * kernel. Insertions are protected by privdata_lock, lookups by > + * rcu_read_lock(). > + */ > + struct radix_tree_root privdata; > + > + /* Protect against concurrent insertions/deletions to privdata */ > + spinlock_t privdata_lock; > + > /* ids of the ancestors at each level including self */ > int ancestor_ids[]; > }; > diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h > index 473e0c0abb86..63d22dfa00bd 100644 > --- a/include/linux/cgroup.h > +++ b/include/linux/cgroup.h > @@ -833,4 +833,11 @@ static inline void put_cgroup_ns(struct cgroup_namespace *ns) > free_cgroup_ns(ns); > } > > +/* cgroup private data handling */ > +int cgroup_priv_getkey(void (*free)(struct kref *)); > +void cgroup_priv_destroykey(int key); > +int cgroup_priv_install(struct cgroup *cgrp, int key, struct kref *ref); > +struct kref *cgroup_priv_get(struct cgroup *cgrp, int key); > +void cgroup_priv_release(struct cgroup *cgrp, int key); > + > #endif /* _LINUX_CGROUP_H */ > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c > index 8cda3bc3ae22..3268a21e8158 100644 > --- a/kernel/cgroup/cgroup.c > +++ b/kernel/cgroup/cgroup.c > @@ -81,8 +81,15 @@ EXPORT_SYMBOL_GPL(css_set_lock); > #endif > > /* > - * Protects cgroup_idr and css_idr so that IDs can be released without > - * grabbing cgroup_mutex. > + * ID allocator for cgroup private data keys; the ID's allocated here will > + * be used to index all per-cgroup radix trees. The radix tree built into > + * the IDR itself will store a key-specific function to be passed to kref_put. > + */ > +static DEFINE_IDR(cgroup_priv_idr); Fun two radixtrees, one to hold the (*free)() and the other the void*. Would not just a struct cgroup_privdata { struct rcu_head rcu; void (*free)(struct kref *); } suffice for subclassing? (Also this is a prime candidate for switching to XArray.) > +struct kref * > +cgroup_priv_get(struct cgroup *cgrp, int key) > +{ > + struct kref *ref; > + > + WARN_ON(cgrp->root != &cgrp_dfl_root); > + WARN_ON(key == 0); > + > + rcu_read_lock(); > + > + ref = radix_tree_lookup(&cgrp->privdata, key); > + if (ref) > + kref_get(ref); This is not safe, you need if (ref && !kref_get_unless_zero(ref)) ref = NULL; otherwise the cgroup_priv_release() may drop the last reference prior to you obtaining yours. Also requires the privdata to be RCU protected. -Chris -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html