On Tue, Jul 21, 2015 at 04:34:07PM -0700, Andrew Morton wrote: > On Sun, 19 Jul 2015 15:31:10 +0300 Vladimir Davydov <vdavydov@xxxxxxxxxxxxx> wrote: > > > This function returns the inode number of the closest online ancestor of > > the memory cgroup a page is charged to. It is required for exporting > > information about which page is charged to which cgroup to userspace, > > which will be introduced by a following patch. > > > > ... > > > > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -441,6 +441,29 @@ struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page) > > return &memcg->css; > > } > > > > +/** > > + * page_cgroup_ino - return inode number of the memcg a page is charged to > > + * @page: the page > > + * > > + * Look up the closest online ancestor of the memory cgroup @page is charged to > > + * and return its inode number or 0 if @page is not charged to any cgroup. It > > + * is safe to call this function without holding a reference to @page. > > + */ > > +unsigned long page_cgroup_ino(struct page *page) > > Shouldn't it return an ino_t? Yep, thanks. > > > +{ > > + struct mem_cgroup *memcg; > > + unsigned long ino = 0; > > + > > + rcu_read_lock(); > > + memcg = READ_ONCE(page->mem_cgroup); > > + while (memcg && !(memcg->css.flags & CSS_ONLINE)) > > + memcg = parent_mem_cgroup(memcg); > > + if (memcg) > > + ino = cgroup_ino(memcg->css.cgroup); > > + rcu_read_unlock(); > > + return ino; > > +} > > The function is racy, isn't it? There's nothing to prevent this inode > from getting torn down and potentially reallocated one nanosecond after > page_cgroup_ino() returns? If so, it is only safely usable by things > which don't care (such as procfs interfaces) and this should be > documented in some fashion. Agree. Here goes the incremental patch: --- diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index d644aadfdd0d..ad800e62cb7a 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -343,7 +343,7 @@ static inline bool mm_match_cgroup(struct mm_struct *mm, } struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page); -unsigned long page_cgroup_ino(struct page *page); +ino_t page_cgroup_ino(struct page *page); static inline bool mem_cgroup_disabled(void) { diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b9c76a0906f9..bd30638c2a95 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -448,8 +448,13 @@ struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page) * Look up the closest online ancestor of the memory cgroup @page is charged to * and return its inode number or 0 if @page is not charged to any cgroup. It * is safe to call this function without holding a reference to @page. + * + * Note, this function is inherently racy, because there is nothing to prevent + * the cgroup inode from getting torn down and potentially reallocated a moment + * after page_cgroup_ino() returns, so it only should be used by callers that + * do not care (such as procfs interfaces). */ -unsigned long page_cgroup_ino(struct page *page) +ino_t page_cgroup_ino(struct page *page) { struct mem_cgroup *memcg; unsigned long ino = 0; -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html