On Sun, Aug 4, 2019 at 2:09 PM John Gallagher <john.gallagher@xxxxxxxxxxx> wrote: > > If a client mounts a subdirectory of an export using NFSv3, the server can end > up with invalid (CACHE_VALID bit unset) entries in its export cache. These > entries indirectly have a reference to the struct mount* containing the export, > preventing the filesystem containing the export from being unmounted, even > after the export has been unexported: > > /tmp# mount --bind foo bar > /tmp# exportfs -o fsid=7 '*:/tmp/bar' > /tmp# mount -t nfs -o vers=3 localhost:/tmp/bar/a /mnt/a > /tmp# umount /mnt/a > /tmp# exportfs -u '*:/tmp/bar' > /tmp# umount /tmp/bar > umount: /tmp/bar: target is busy. > /tmp# cat /proc/net/rpc/nfsd.export/content > #path domain(flags) > # /tmp/bar/a *() > > It looks like what's happening is that when rpc.mountd does a downcall to get a > filehandle corresponding to a particular path, exp_parent() traverses the > elements in the given path looking for one which is in the export cache. If > there isn't a valid entry in the cache for the given path, which there won't be > for a subdirectory of an export, then sunrpc_cache_lookup_rcu() inserts an > new, invalid entry. Looking into this a bit further, it seems that this is a regression, probably introduced by d6fc8821c2d2aba4cc18447a467f543e46e7367d in 4.13. That commit adds the additional check of the CACHE_VALID bit in cache_is_expired() which prevents these entries from ever being considered expired, and therefore from ever being flushed. I can think of at least a few possible approaches for fixing this: 1. Prevent exp_parent() from adding these entries. I suspect they aren't very useful anyway, since, being invalid, they aren't actually caching any info from userspace. Perhaps we could add a new helper function for caches which does lookups without adding a new entry when it doesn't find an existing entry. 2. Find a way to make the check in cache_is_expired() more specific, so that it solves the issue from d6fc8821, but still allows these entries to be considered expired when we flush the cache. 3. Find some alternate way to solve the issue from d6fc8821 which doesn't affect the way caches are flushed. I haven't looked at the issue solved by d6fc8821 yet, so I don't know how practical 2 and 3 are. Suggestions on what approach might be best would be welcome. Once I get an idea on how best to proceed, I'd be happy to put a patch together. -John