Re: [RFC PATCH] ext4: Fix fallocate to update the file size in each transaction.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Eric Sandeen wrote:
> Aneesh Kumar K.V wrote:
>> ext4_fallocate need to update file size in each transaction. Otherwise
>> ife we crash the file size won't be updated. We were also not marking
>> the inode dirty after updating file size before. 
> 
> This led to losing the size update when unmounted...
> 
>> Also when we try to
>> retry allocation due to ENOSPC make sure we reset the variable ret so
>> that we actually do a retry.
> 
> That's a few alsos.  Should this be more than one patch when committed?
> 
> -Eric

Oops that "-Eric" was too soon, there are more comments below ;)

>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
>> ---
>>  fs/ext4/extents.c |   78 +++++++++++++++++++++--------------------------------
>>  1 files changed, 31 insertions(+), 47 deletions(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index dcdf92a..09dd3c5 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -2783,6 +2783,26 @@ int ext4_ext_writepage_trans_blocks(struct inode *inode, int num)
>>  	return needed;
>>  }
>>  
>> +static void ext4_falloc_update_inode(struct inode *inode,
>> +					int mode, loff_t new_size)
>> +{
>> +	struct timespec now;
>> +
>> +	now = current_fs_time(inode->i_sb);
>> +	if (!timespec_equal(&inode->i_ctime, &now))
>> +		inode->i_ctime = now;
> 
> This used to depend on blocks actually being allocated; it looks like it
> doesn't anymore?
> 
>> +	/*
>> +	 * Update only when preallocation was requested beyond
>> +	 * the file size.
>> +	 */
>> +	if (!(mode & FALLOC_FL_KEEP_SIZE) &&
>> +				new_size > i_size_read(inode)) {
>> +		i_size_write(inode, new_size);
>> +		EXT4_I(inode)->i_disksize = i_size_read(inode);
>> +	}
>> +
>> +}
> 
> Ok, I think this is all cleaner...
> 
>>  /*
>>   * preallocate space for a file. This implements ext4's fallocate inode
>>   * operation, which gets called from sys_fallocate system call.
>> @@ -2794,8 +2814,8 @@ long ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len)
>>  {
>>  	handle_t *handle;
>>  	ext4_lblk_t block;
>> +	loff_t new_size;
>>  	unsigned long max_blocks;
>> -	ext4_fsblk_t nblocks = 0;
>>  	int ret = 0;
>>  	int ret2 = 0;
>>  	int retries = 0;
>> @@ -2814,8 +2834,7 @@ long ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len)
>>  		return -ENODEV;
>>  
>>  	block = offset >> blkbits;
>> -	max_blocks = (EXT4_BLOCK_ALIGN(len + offset, blkbits) >> blkbits)
>> -			- block;
>> +	max_blocks = EXT4_BLOCK_ALIGN(len, blkbits) >> blkbits;
> 
> Ok, this makes much more sense IMHO.
> 
>>  
>>  	/*
>>  	 * credits to insert 1 extent into extent tree + buffers to be able to
>> @@ -2848,28 +2867,13 @@ retry:
>>  			ret2 = ext4_journal_stop(handle);
>>  			break;
>>  		}
>> -		if (ret > 0) {
>> -			/* check wrap through sign-bit/zero here */
>> -			if ((block + ret) < 0 || (block + ret) < block) {
>> -				ret = -EIO;
>> -				ext4_mark_inode_dirty(handle, inode);
>> -				ret2 = ext4_journal_stop(handle);
>> -				break;
>> -			}
>> -			if (buffer_new(&map_bh) && ((block + ret) >
>> -			    (EXT4_BLOCK_ALIGN(i_size_read(inode), blkbits)
>> -			    >> blkbits)))
>> -					nblocks = nblocks + ret;
>> -		}
>> +		if ((block + ret) >= (EXT4_BLOCK_ALIGN(offset + len,
>> +						blkbits) >> blkbits))
>> +			new_size = offset + len;
>> +		else
>> +			new_size = (block + ret) << blkbits;
> 
> Since we called ext4_get_blocks_wrap with max_blocks, will we really end
> up with more blocks allocated than we asked for?  And do we no longer
> need the wrap checks?  (I'd expect that to be tested higher up with the
> original offset+len requested, no?)
> 
>> -		/* Update ctime if new blocks get allocated */
>> -		if (nblocks) {
>> -			struct timespec now;
>> -
>> -			now = current_fs_time(inode->i_sb);
>> -			if (!timespec_equal(&inode->i_ctime, &now))
>> -				inode->i_ctime = now;
>> -		}
>> +		ext4_falloc_update_inode(inode, mode, new_size);
>>  
>>  		ext4_mark_inode_dirty(handle, inode);
>>  		ret2 = ext4_journal_stop(handle);
>> @@ -2877,30 +2881,10 @@ retry:
>>  			break;
>>  	}
>>  
>> -	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
>> +	if (ret == -ENOSPC &&
>> +			ext4_should_retry_alloc(inode->i_sb, &retries)) {
>> +		ret = 0;
>>  		goto retry;
>> -
>> -	/*
>> -	 * Time to update the file size.
>> -	 * Update only when preallocation was requested beyond the file size.
>> -	 */
>> -	if (!(mode & FALLOC_FL_KEEP_SIZE) &&
>> -	    (offset + len) > i_size_read(inode)) {
>> -		if (ret > 0) {
>> -			/*
>> -			 * if no error, we assume preallocation succeeded
>> -			 * completely
>> -			 */
>> -			i_size_write(inode, offset + len);
>> -			EXT4_I(inode)->i_disksize = i_size_read(inode);
>> -		} else if (ret < 0 && nblocks) {
>> -			/* Handle partial allocation scenario */
>> -			loff_t newsize;
>> -
>> -			newsize  = (nblocks << blkbits) + i_size_read(inode);
>> -			i_size_write(inode, EXT4_BLOCK_ALIGN(newsize, blkbits));
>> -			EXT4_I(inode)->i_disksize = i_size_read(inode);
>> -		}
>>  	}
>>  
>>  	mutex_unlock(&inode->i_mutex);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux