Re: [PATCH] vfs: partially sanitize i_state zeroing on inode creation

Jan Kara <jack@xxxxxxx> · Tue, 11 Jun 2024 12:02:22 +0200

On Tue 11-06-24 06:15:40, Mateusz Guzik wrote:
> new_inode used to have the following:
> 	spin_lock(&inode_lock);
> 	inodes_stat.nr_inodes++;
> 	list_add(&inode->i_list, &inode_in_use);
> 	list_add(&inode->i_sb_list, &sb->s_inodes);
> 	inode->i_ino = ++last_ino;
> 	inode->i_state = 0;
> 	spin_unlock(&inode_lock);
> 
> Over time things disappeared, got moved around or got replaced (the
> global inode lock with a per-inode lock); eventually this got reduced to:
> 	spin_lock(&inode->i_lock);
> 	inode->i_state = 0;
> 	spin_unlock(&inode->i_lock);
> 
> But the lock acquire here does not synchronize against anyone.
> 
> Additionally iget5_locked performs i_state = 0 assignment without any
> locks to begin with and the two combined look confusing at best.
> 
> It looks like the current state is a leftover which was not cleaned up.
> 
> Ideally it would be an invariant that i_state == 0 to begin with, but
> achieving that would require dealing with all filesystem alloc handlers
> one by one.
> 
> In the meantime drop the misleading locking and move i_state zeroing to
> alloc_inode so that others don't need to deal with it by hand.
> 
> Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>

Good point. But the initialization would seem more natural in
inode_init_always(), wouldn't it? And that will also address your "FIXME"
comment.
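
For illustration only, a minimal userspace sketch of that idea. This is not
kernel code: struct mock_inode and the mock_* functions are hypothetical
stand-ins for struct inode, inode_init_always() and alloc_inode(), and the
0xff fill merely simulates a filesystem alloc handler returning memory with
stale contents.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct inode; only the relevant fields. */
struct mock_inode {
	unsigned long i_state;
	unsigned long i_ino;
};

/*
 * Hypothetical stand-in for inode_init_always(): the one place every
 * freshly allocated inode passes through, so i_state is zeroed here.
 */
static int mock_inode_init_always(struct mock_inode *inode)
{
	inode->i_state = 0;
	inode->i_ino = 0;
	return 0;
}

/*
 * Hypothetical stand-in for alloc_inode(). No lock is taken: the object
 * is not visible to anyone else yet, so there is nothing to synchronize
 * against.
 */
static struct mock_inode *mock_alloc_inode(void)
{
	struct mock_inode *inode = malloc(sizeof(*inode));

	if (!inode)
		return NULL;

	/* Assumption: simulate an allocator handing back stale memory. */
	memset(inode, 0xff, sizeof(*inode));

	if (mock_inode_init_always(inode)) {
		free(inode);
		return NULL;
	}

	/* With the init path zeroing i_state, the invariant can be asserted. */
	assert(inode->i_state == 0);
	return inode;
}
```

With the zeroing in the common init path, callers such as new_inode_pseudo()
and iget5_locked() would presumably not need their own i_state = 0, and an
assertion like the one above could take the place of the FIXME.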

								Honza

> ---
> 
> I diffed this against fs-next + my inode hash patch as it adds one
> i_state = 0 case. Should that patch not be accepted this bit can be
> easily dropped from this one.
> 
> I brought the entire thing up quite some time ago [1] and Dave Chinner
> noted that perhaps the lock has a side effect of providing memory
> barriers which otherwise would not be there and which are needed by
> someone.
> 
> For new_inode and alloc_inode consumers all fences are already there
> anyway due to immediate lock usage.
> 
> Arguably new_inode_pseudo escapes without them, but I don't find the code
> at hand to be affected in any meaningful way -- the only 2 consumers
> (get_pipe_inode and sock_alloc) perform numerous other stores to the
> inode immediately after. By the time the inode gets added to anything
> looking at i_state, flushing those stores should be handled by whatever
> adds it. Mentioning this just in case.
> 
> [1] https://lore.kernel.org/all/CAGudoHF_Y0shcU+AMRRdN5RQgs9L_HHvBH8D4K=7_0X72kYy2g@xxxxxxxxxxxxxx/
> 
>  fs/inode.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index 149adf8ab0ea..3967e68311a6 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -276,6 +276,10 @@ static struct inode *alloc_inode(struct super_block *sb)
>  		return NULL;
>  	}
>  
> +	/*
> +	 * FIXME: the code should be able to assert i_state == 0 instead.
> +	 */
> +	inode->i_state = 0;
>  	return inode;
>  }
>  
> @@ -1023,14 +1027,7 @@ EXPORT_SYMBOL(get_next_ino);
>   */
>  struct inode *new_inode_pseudo(struct super_block *sb)
>  {
> -	struct inode *inode = alloc_inode(sb);
> -
> -	if (inode) {
> -		spin_lock(&inode->i_lock);
> -		inode->i_state = 0;
> -		spin_unlock(&inode->i_lock);
> -	}
> -	return inode;
> +	return alloc_inode(sb);
>  }
>  
>  /**
> @@ -1254,7 +1251,6 @@ struct inode *iget5_locked(struct super_block *sb, unsigned long hashval,
>  		struct inode *new = alloc_inode(sb);
>  
>  		if (new) {
> -			new->i_state = 0;
>  			inode = inode_insert5(new, hashval, test, set, data);
>  			if (unlikely(inode != new))
>  				destroy_inode(new);
> @@ -1297,7 +1293,6 @@ struct inode *iget5_locked_rcu(struct super_block *sb, unsigned long hashval,
>  
>  	new = alloc_inode(sb);
>  	if (new) {
> -		new->i_state = 0;
>  		inode = inode_insert5(new, hashval, test, set, data);
>  		if (unlikely(inode != new))
>  			destroy_inode(new);
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR