On Wed, Oct 14, 2020 at 4:08 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > On Wed, Oct 14, 2020 at 01:45:28PM -0700, Andrii Nakryiko wrote: > > Fix data race in prepend_path() with re-reading mnt->mnt_ns twice without > > holding the lock. is_mounted() does check for NULL, but is_anon_ns(mnt->mnt_ns) > > might re-read the pointer again which could be NULL already, if in between > > reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets mnt->mnt_ns > > to NULL. > > Cute... What config/compiler has resulted in that? I agree with the analysis, but Don't know for sure, but nothing special or exotic, a typical Facebook production kernel config and some version of GCC (I didn't check exactly which one). Just given enough servers in the fleet, with time and intensive workloads races like this, however unlikely, do surface up pretty regularly. > I really hate the open-coded (and completely unexplained) use of IS_ERR_OR_NULL() > there. > > > - if (is_mounted(vfsmnt) && !is_anon_ns(mnt->mnt_ns)) > > + mnt_ns = READ_ONCE(mnt->mnt_ns); > > + /* open-coded is_mounted() to use local mnt_ns */ > > + if (!IS_ERR_OR_NULL(mnt_ns) && !is_anon_ns(mnt_ns)) > > error = 1; // absolute root > > else > > error = 2; // detached or not attached yet > > Better turn that into an inlined helper in fs/mount.h, next to is_mounted(), IMO, > and kill that IS_ERR_OR_NULL garbage there. What that thing does is > if (ns == NULL || ns == MNT_NS_INTERNAL) > and it's *not* on any kind of hot path to warrant that kind of microoptimizations. Sounds good. I didn't know code well enough to infer NULL || MNT_NS_INTERNAL instead of IS_ERR_OR_NULL from is_mounted(), so just open-coded the latter. > > So let's make that > > static inline bool is_real_ns(struct mnt_namespace *mnt_ns) > { > return mnt_ns && mnt_ns != MNT_NS_INTERNAL; > } > > turn is_mounted(m) into is_real_ns(m->mnt_ns) and replace that line in your fix > with > if (is_real_ns(mnt_ns) && !is_anon_ns(mnt_ns)) > > Objections? Nope, I'll send a follow-up patch, thanks.