Re: [PATCH 3/4] vfs: count unlinked inodes

On Mon, Nov 21, 2011 at 12:11:32PM +0100, Miklos Szeredi wrote:
> @@ -241,6 +242,11 @@ void __destroy_inode(struct inode *inode)
>  	BUG_ON(inode_has_buffers(inode));
>  	security_inode_free(inode);
>  	fsnotify_inode_delete(inode);
> +	if (!inode->i_nlink) {
> +		WARN_ON(atomic_long_read(&inode->i_sb->s_remove_count) == 0);
> +		atomic_long_dec(&inode->i_sb->s_remove_count);
> +	}

Umm...  That relies on ->destroy_inode() doing nothing stupid; granted,
all work on actual file removal should've been done in ->evict_inode()
leaving only (RCU'd) freeing of the in-core inode, but there are odd ones
that do strange things in ->destroy_inode() and I'm not sure that it's not
Yet Another Remount Race(tm).  OTOH, it's clearly not worse than what
we used to have; just something to keep in mind for future work.

Anyway, I'm mostly OK with that series; I still hate your per-superblock
list of vfsmounts, but at least on top of the vfsmount-guts series they
won't be a temptation for abuse - list goes through struct mount now,
so filesystems won't be able to do fun things like "iterate through all
places where I'm mounted" (and #include "../mounts.h" in any fs code
will be a shootable offense - at least that is easy to spot).

There is another thing I'm less than happy about - suppose you have a
corrupted fs and run into zero on-disk i_nlink.  Sure, the inode will
get immediately evicted and __destroy_inode() will happen; however, for
the duration of that window you end up with bumped ->s_remove_count.
Transient EROFS is annoying, but tolerable - we only hit it if an attempt
to remount r/o fails in ->remount_fs().  But this is something different -
it's a transient -EBUSY on attempt to remount r/o happening when nothing
actually is trying to do any kind of write access at all.  As it is,
you have ->s_remove_count equal to the number of in-core inodes with
zero ->i_nlink that had not yet reached __destroy_inode().  Hell knows...
Maybe we want two versions of set_nlink(); one doing what yours does,
another returning -EINVAL if asked to set i_nlink to 0.  And assorted
foo_read_inode() would use the latter.  Anyway, that's a separate work;
so's the analysis of what happens if a directory entry points to an
on-disk inode with zero i_nlink.
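
For illustration only, here is a userspace sketch of the two-helper idea and
of the window described above.  It is not kernel code: every model_* name is
hypothetical, the structures carry only the fields the accounting touches,
the assert() is a stricter stand-in for the WARN_ON() in the patch, and the
-EBUSY check is a simplification of whatever the remount path really does.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for struct super_block and struct inode. */
struct model_sb {
	long s_remove_count;	/* in-core inodes with zero i_nlink */
};

struct model_inode {
	unsigned int i_nlink;
	struct model_sb *i_sb;
};

/* Unchecked variant: keeps s_remove_count in step with i_nlink. */
static void model_set_nlink(struct model_inode *inode, unsigned int nlink)
{
	if (nlink == 0) {
		if (inode->i_nlink != 0)
			inode->i_sb->s_remove_count++;
	} else if (inode->i_nlink == 0) {
		inode->i_sb->s_remove_count--;
	}
	inode->i_nlink = nlink;
}

/* Checked variant for foo_read_inode()-style callers: refuses zero. */
static int model_set_nlink_checked(struct model_inode *inode,
				   unsigned int nlink)
{
	if (nlink == 0)
		return -EINVAL;
	model_set_nlink(inode, nlink);
	return 0;
}

/* Mirrors the hunk quoted above: drop the count when the inode dies. */
static void model_destroy_inode(struct model_inode *inode)
{
	if (inode->i_nlink == 0) {
		assert(inode->i_sb->s_remove_count > 0);
		inode->i_sb->s_remove_count--;
	}
}

/* Simplified remount check: busy while unlinked inodes are in core. */
static int model_remount_ro(const struct model_sb *sb)
{
	return sb->s_remove_count ? -EBUSY : 0;
}

/* Corrupted inode read with the unchecked helper; remount in the window. */
static int window_with_unchecked(void)
{
	struct model_sb sb = { 0 };
	struct model_inode inode = { .i_nlink = 1, .i_sb = &sb };
	int ret;

	model_set_nlink(&inode, 0);	/* on-disk i_nlink was zero */
	ret = model_remount_ro(&sb);	/* lands before the inode is freed */
	model_destroy_inode(&inode);
	return ret;
}

/* Same read with the checked helper: the count is never bumped. */
static int window_with_checked(void)
{
	struct model_sb sb = { 0 };
	struct model_inode inode = { .i_nlink = 1, .i_sb = &sb };
	int ret;

	if (model_set_nlink_checked(&inode, 0) != -EINVAL)
		return -EIO;		/* not expected in this sketch */
	ret = model_remount_ro(&sb);
	model_destroy_inode(&inode);
	return ret;
}
```

In this model window_with_unchecked() reports -EBUSY although nothing asked
for write access, while window_with_checked() reports 0 because the read
fails before the counter is touched.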

Applied, with rebase on top of vfsmount-guts.  Will push the whole pile
into #for-next as soon as I finish sorting out conflicts in btrfs patches
versus btrfs tree.