Re: Race between __sync_single_inode() and LogFS garbage collector

On Mon, 2007-02-19 at 21:31 +0000, Jörn Engel wrote:
> Looks like I really am writing the first log-structured filesystem for
> Linux.  At least I ran into a fairly arcane race that seems to be generic
> to all of them.
> 
> Writing when space is tight may involve calling the garbage collector.
> The garbage collector will iget() random inodes, either to verify if a
> block is valid or to copy the block around.  At this point, all writes
> to LogFS are serialized.
> 
> __sync_single_inode() will first lock a random inode, then call
> write_inode(), then unlock the inode.  So we can get this:
> 
> 
> __sync_single_inode()			garbage collector
> ---------------------------------------------------------------------
> inode->i_state |= I_LOCK;		...
> ...					mutex_lock(&super->s_w_mutex);
> write_inode(inode, wait);		...
>   ...					iget(sb, ino);
>   mutex_lock(&super->s_w_mutex);	...
>   ...					  wait_on_inode(inode);
>   mutex_unlock(&super->s_w_mutex);	
>   ...					
> ...
> inode->i_state &= ~I_LOCK;
> 
> 
> And once in a blue moon, those two will race for the same inode.  As far
> as I can see, the race can only get fixed in two ways:
> 1. Never iget() inside the garbage collector.  That would require having
>    a private inode cache for LogFS.
> 2. Synchronize __sync_single_inode() and the garbage collector somehow.
> 
> Variant 1 would result in double caching for the same object, something
> I would like to avoid.  So does anyone have suggestions how variant 2
> could be achieved?  Essentially what I need is a way to say "don't sync
> any inodes right now, I'll be back in 5 milliseconds or so".

It'd be nice if you could drop s_w_mutex while the garbage collector
calls iget().
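
To make the lock ordering concrete, here is a small userspace model of
the interleaving from the diagram above.  It is only a sketch, not kernel
code: the step tables, the round-robin scheduler and every name in it
(runs_to_completion, gc_dropping_mutex, and so on) are made up for
illustration, and wait_on_inode() is approximated as "block until I_LOCK
is clear".  Under those assumptions it shows that the garbage collector
holding s_w_mutex across the wait gets stuck, while dropping the mutex
first lets both sides finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for the two things the paths contend on. */
enum res { RES_I_LOCK, RES_S_W_MUTEX, RES_COUNT };
enum op  { OP_ACQUIRE, OP_RELEASE, OP_WAIT_CLEAR };

struct step  { enum op op; enum res res; };
struct actor { const struct step *steps; size_t len; size_t pc; };

/* Model of __sync_single_inode(): I_LOCK first, then s_w_mutex inside
 * write_inode(), released in reverse order. */
static const struct step sync_path[] = {
    { OP_ACQUIRE, RES_I_LOCK },    { OP_ACQUIRE, RES_S_W_MUTEX },
    { OP_RELEASE, RES_S_W_MUTEX }, { OP_RELEASE, RES_I_LOCK },
};

/* Model of the garbage collector as in the diagram: it waits on the
 * inode while still holding s_w_mutex. */
static const struct step gc_holding_mutex[] = {
    { OP_ACQUIRE, RES_S_W_MUTEX }, { OP_WAIT_CLEAR, RES_I_LOCK },
    { OP_RELEASE, RES_S_W_MUTEX },
};

/* Variant: drop s_w_mutex before waiting on the inode, retake it after. */
static const struct step gc_dropping_mutex[] = {
    { OP_ACQUIRE, RES_S_W_MUTEX }, { OP_RELEASE, RES_S_W_MUTEX },
    { OP_WAIT_CLEAR, RES_I_LOCK }, { OP_ACQUIRE, RES_S_W_MUTEX },
    { OP_RELEASE, RES_S_W_MUTEX },
};

/* Execute the actor's next step if it is not blocked. */
static bool try_step(struct actor *a, int id, int owner[RES_COUNT])
{
    const struct step *s = &a->steps[a->pc];

    switch (s->op) {
    case OP_ACQUIRE:
        if (owner[s->res] != -1)
            return false;              /* held by someone: blocked */
        owner[s->res] = id;
        break;
    case OP_RELEASE:
        assert(owner[s->res] == id);   /* only the holder may release */
        owner[s->res] = -1;
        break;
    case OP_WAIT_CLEAR:
        if (owner[s->res] != -1)
            return false;              /* still set: keep waiting */
        break;
    }
    a->pc++;
    return true;
}

/* Alternate between the two paths, sync side first.  Returns true if
 * both finish, false if a round passes in which neither can move. */
static bool runs_to_completion(const struct step *sync_steps, size_t n_sync,
                               const struct step *gc_steps, size_t n_gc)
{
    struct actor actors[2] = {
        { sync_steps, n_sync, 0 }, { gc_steps, n_gc, 0 },
    };
    int owner[RES_COUNT] = { -1, -1 };

    for (;;) {
        bool progressed = false;

        for (int id = 0; id < 2; id++)
            if (actors[id].pc < actors[id].len &&
                try_step(&actors[id], id, owner))
                progressed = true;

        if (actors[0].pc == actors[0].len && actors[1].pc == actors[1].len)
            return true;
        if (!progressed)
            return false;              /* both blocked: deadlock */
    }
}
```

The model only covers the one unlucky interleaving where both paths pick
the same inode; it says nothing about whether dropping s_w_mutex at that
point is safe for the rest of the garbage collector's state.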

Otherwise, you may be able to call ilookup5_nowait() in the garbage
collector, and skip that inode if I_LOCK is set.
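
Roughly, the skip-if-locked idea looks like the userspace sketch below.
It is an illustration only: model_inode, MODEL_I_LOCK,
model_lookup_nowait() and model_gc_pass() are invented names, the array
stands in for the inode cache, and the lookup does not mirror the real
ilookup5_nowait() signature.  The point is just the control flow: look
the inode up without waiting, and leave it alone for this pass if it is
locked.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_I_LOCK 0x1u   /* stand-in for the kernel's I_LOCK bit */

struct model_inode {
    unsigned long ino;
    unsigned int  i_state;
};

struct gc_stats {
    size_t moved;           /* inodes the pass was free to work on */
    size_t skipped_locked;  /* inodes left alone because they were locked */
    size_t not_cached;      /* inodes that were not in the cache at all */
};

/* Find an inode in the model cache without waiting on MODEL_I_LOCK.
 * Returns NULL if the inode is not cached. */
static struct model_inode *model_lookup_nowait(struct model_inode *cache,
                                               size_t n_cache,
                                               unsigned long ino)
{
    for (size_t i = 0; i < n_cache; i++)
        if (cache[i].ino == ino)
            return &cache[i];
    return NULL;
}

/* One garbage collection pass over a list of candidate inode numbers. */
static struct gc_stats model_gc_pass(struct model_inode *cache, size_t n_cache,
                                     const unsigned long *victims,
                                     size_t n_victims)
{
    struct gc_stats st = { 0, 0, 0 };

    assert(cache != NULL || n_cache == 0);

    for (size_t i = 0; i < n_victims; i++) {
        struct model_inode *inode =
            model_lookup_nowait(cache, n_cache, victims[i]);

        if (!inode) {
            /* Not cached, so nobody can be syncing it right now. */
            st.not_cached++;
            continue;
        }
        if (inode->i_state & MODEL_I_LOCK) {
            /* Someone holds the inode lock: do not wait, skip it. */
            st.skipped_locked++;
            continue;
        }
        st.moved++;
    }
    return st;
}
```

A skipped inode would have to be retried on a later pass, so this only
helps if the garbage collector can make progress on some other block in
the meantime.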

> 
> Jörn
> 
-- 
David Kleikamp
IBM Linux Technology Center
