Looks like I really am writing the first log-structured filesystem for
Linux. At least I ran into a fairly arcane race that seems to be generic
to all of them.

Writing when space is tight may involve calling the garbage collector.
The garbage collector will iget() random inodes, either to verify
whether a block is valid or to copy the block around. At this point, all
writes to LogFS are serialized.

__sync_single_inode() will first lock a random inode, then call
write_inode(), then unlock the inode. So we can get this:

__sync_single_inode()                   garbage collector
---------------------------------------------------------------------
inode->i_state |= I_LOCK;               ...
...                                     mutex_lock(&super->s_w_mutex);
write_inode(inode, wait);               ...
...                                     iget(sb, ino);
mutex_lock(&super->s_w_mutex);          ...
...                                     wait_on_inode(inode);
mutex_unlock(&super->s_w_mutex);        ...
...
inode->i_state &= ~I_LOCK;

And once in a blue moon, those two will race for the same inode. When
that happens, __sync_single_inode() holds I_LOCK and waits for
s_w_mutex, while the garbage collector holds s_w_mutex and waits in
wait_on_inode() for I_LOCK to clear, so neither side can make progress.

As far as I can see, the race can only get fixed in two ways:
1. Never iget() inside the garbage collector. That would require having
   a private inode cache for LogFS.
2. Synchronize __sync_single_inode() and the garbage collector somehow.

Variant 1 would result in double caching for the same object, something
I would like to avoid. So does anyone have suggestions on how variant 2
could be achieved? Essentially what I need is a way to say "don't sync
any inodes right now, I'll be back in 5 milliseconds or so".

Jörn

-- 
Courage is not the absence of fear, but rather the judgement that
something else is more important than fear.
-- Ambrose Redmoon