Manuel Schölling <manuel.schoelling@xxxxxx> wrote:

> [485208.579361] CacheFiles: Error: Unexpected object collision
> [485208.579364] object: OBJ1b354
> [485208.579367] objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0]
> [485208.579369] ops=0 inp=0 exc=0
> [485208.579371] parent=ffff88053f5417c0
> [485208.579373] cookie=ffff880538f202a0 [pr=ffff8805381b7160 nd=ffff880509c6eb78 fl=27]
> [485208.579375] key=[8] '2490000000000000'
> [485208.579381] xobject: OBJ1a600
> [485208.579384] xobjstate=DROP_OBJECT fl=70 wbusy=2 ev=0[0]
> [485208.579386] xops=0 inp=0 exc=0
> [485208.579387] xparent=ffff88053f5417c0
> [485208.579389] xcookie=ffff88050f4cbf70 [pr=ffff8805381b7160 nd= (null) fl=12]

On the face of it, this looks like the first object should just be waiting
for the second.

The flags on the first object (fl=8) are:

	FSCACHE_OBJECT_IS_LIVE

and the flags on the second object (fl=70) are:

	FSCACHE_OBJECT_IS_LOOKED_UP
	FSCACHE_OBJECT_IS_AVAILABLE
	FSCACHE_OBJECT_RETIRED

I think that this test:

	-->	if (fscache_object_is_live(&object->fscache)) {
			pr_err("\n");
			pr_err("Error: Unexpected object collision\n");
			cachefiles_printk_object(object, xobject);
			BUG();
		}

is looking at the wrong object...

Also xobject->flags is 1, which is:

	CACHEFILES_OBJECT_ACTIVE

so we should just proceed to the part following the above if-statement where
we wait for this to be cleared.

Does this patch fix this oops for you?

David
---
commit cc0d3e7246ace3f5b695eb6a144461e041566f24
Author: David Howells <dhowells@xxxxxxxxxx>
Date:   Thu Sep 25 11:10:06 2014 +0100

    CacheFiles: Fix incorrect test for in-memory object collision

    When CacheFiles cache objects are in use, they have in-memory
    representations, as defined by the cachefiles_object struct.  These are
    kept in a tree rooted in the cache and indexed by dentry pointer (since
    there's a unique mapping between object index key and dentry).

    Collisions can occur between a representation already in the tree and a
    new representation being set up because it takes time to dispose of an
    old representation - particularly if it must be unlinked or renamed.

    When such a collision occurs, cachefiles_mark_object_active() is meant to
    check to see if the old, already-present representation is in the process
    of being discarded (ie. FSCACHE_OBJECT_IS_LIVE is not set on it) - and, if
    so, wait for the representation to be removed (ie.
    CACHEFILES_OBJECT_ACTIVE is then cleared).

    However, the test for whether the old representation is still live is
    checking the new object - which always will be live at this point.  This
    leads to an oops looking like:

        CacheFiles: Error: Unexpected object collision
        object: OBJ1b354
        objstate=LOOK_UP_OBJECT fl=8 wbusy=2 ev=0[0]
        ops=0 inp=0 exc=0
        parent=ffff88053f5417c0
        cookie=ffff880538f202a0 [pr=ffff8805381b7160 nd=ffff880509c6eb78 fl=27]
        key=[8] '2490000000000000'
        xobject: OBJ1a600
        xobjstate=DROP_OBJECT fl=70 wbusy=2 ev=0[0]
        xops=0 inp=0 exc=0
        xparent=ffff88053f5417c0
        xcookie=ffff88050f4cbf70 [pr=ffff8805381b7160 nd= (null) fl=12]
        ------------[ cut here ]------------
        kernel BUG at fs/cachefiles/namei.c:200!
        ...
        Workqueue: fscache_object fscache_object_work_func [fscache]
        ...
        RIP: ... cachefiles_walk_to_object+0x7ea/0x860 [cachefiles]
        ...
        Call Trace:
         [<ffffffffa04dadd8>] ? cachefiles_lookup_object+0x58/0x100 [cachefiles]
         [<ffffffffa01affe9>] ? fscache_look_up_object+0xb9/0x1d0 [fscache]
         [<ffffffffa01afc4d>] ? fscache_parent_ready+0x2d/0x80 [fscache]
         [<ffffffffa01b0672>] ? fscache_object_work_func+0x92/0x1f0 [fscache]
         [<ffffffff8107e82b>] ? process_one_work+0x16b/0x400
         [<ffffffff8107fc16>] ? worker_thread+0x116/0x380
         [<ffffffff8107fb00>] ? manage_workers.isra.21+0x290/0x290
         [<ffffffff81085edc>] ? kthread+0xbc/0xe0
         [<ffffffff81085e20>] ? flush_kthread_worker+0x80/0x80
         [<ffffffff81502d0c>] ? ret_from_fork+0x7c/0xb0
         [<ffffffff81085e20>] ? flush_kthread_worker+0x80/0x80

    Reported-by: Manuel Schölling <manuel.schoelling@xxxxxx>
    Signed-off-by: David Howells <dhowells@xxxxxxxxxx>

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 83e9c94ca2cf..edd0961c20df 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -189,7 +189,7 @@ try_again:
 	/* an old object from a previous incarnation is hogging the slot - we
 	 * need to wait for it to be destroyed */
 wait_for_old_object:
-	if (fscache_object_is_live(&object->fscache)) {
+	if (fscache_object_is_live(&xobject->fscache)) {
 		pr_err("\n");
 		pr_err("Error: Unexpected object collision\n");
 		cachefiles_printk_object(object, xobject);

--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs