On Fri, Feb 19, 2010 at 10:14 AM, David Howells <dhowells@xxxxxxxxxx> wrote: > cachefiles_delete_object() can race with rename. It gets the parent directory > of the object it's asked to delete, then locks it - but rename may have changed > the object's parent between the get and the completion of the lock. > > However, if such a circumstance is detected, we abandon our attempt to delete > the object - since it's no longer in the index key path, it won't be seen > again by lookups of that key. The assumption is that cachefilesd may have > culled it by renaming it to the graveyard for later destruction. > > Signed-off-by: David Howells <dhowells@xxxxxxxxxx> > --- > > fs/cachefiles/namei.c | 12 +++++++++++- > 1 files changed, 11 insertions(+), 1 deletions(-) > > diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c > index 14ac480..eeb4986 100644 > --- a/fs/cachefiles/namei.c > +++ b/fs/cachefiles/namei.c > @@ -348,7 +348,17 @@ int cachefiles_delete_object(struct cachefiles_cache *cache, > dir = dget_parent(object->dentry); > > mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT); > - ret = cachefiles_bury_object(cache, dir, object->dentry); > + > + /* we need to check that our parent is _still_ our parent - it may have > + * been renamed */ > + if (dir == object->dentry->d_parent) { > + ret = cachefiles_bury_object(cache, dir, object->dentry); > + } else { > + /* it got moved, presumably by cachefilesd culling it, so it's > + * no longer in the key path and we can ignore it */ > + mutex_unlock(&dir->d_inode->i_mutex); > + ret = 0; > + } > > dput(dir); > _leave(" = %d", ret); > Dunno if this patch was related to the other thread where I mentioned the "Unlink failed" error. But since it's coming from the called function -- cachefiles_bury_object -- of the patched function I'm guessing it's related, so I figured I'd mention that with 2.6.32.9 (with the above patch applied) this morning, it's still giving the same error: Feb 24 13:50:31 server kernel: [ 3.064935] FS-Cache: Loaded Feb 24 13:50:31 server kernel: [ 3.082345] CacheFiles: Loaded Feb 24 13:50:31 server kernel: [ 6.884175] FS-Cache: Netfs 'nfs' registered for caching Feb 24 13:50:34 server kernel: [ 78.658626] FS-Cache: Cache "mycache" added (type cachefiles) Feb 24 13:50:34 server kernel: [ 78.658631] CacheFiles: File cache on sdb1 registered Feb 24 14:10:43 server kernel: [ 1287.755925] CacheFiles: I/O Error: Unlink failed Feb 24 14:10:43 server kernel: [ 1287.755930] FS-Cache: Cache cachefiles stopped due to I/O error At the time of the error, the cache had close to 1gb in it: /dev/sdb1 14428928 987852 12708112 8% /var/cache/fscache I had looked a bit earlier at the number of files and it was over 35k at that point (so was probably another 0-10k files by the time the error occurred). Let me know if there's any info I can provide that would be helpful. -- Linux-cachefs mailing list Linux-cachefs@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cachefs