2013/5/2, OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>:
> OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx> writes:
>
>> Namjae Jeon <linkinjeon@xxxxxxxxx> writes:
>>
>>>> Then, per-file discard of fallocate space sounds wrong. fallocate
>>>> space is probably an inode attribute.
>>> Since our preallocation will not be persistent after umount, we need
>>> to free up the space at some point.
>>> If we look at normal pre-allocation in ext4, there too the blocks are
>>> removed in ext4_release_file() when the last writer closes the file:
>>>
>>> ext4_release_file()
>>> {
>>>         ...
>>>         /* if we are the last writer on the inode, drop the block reservation */
>>>         if ((filp->f_mode & FMODE_WRITE) &&
>>>             (atomic_read(&inode->i_writecount) == 1) &&
>>>             !EXT4_I(inode)->i_reserved_data_blocks)
>>>         {
>>>                 down_write(&EXT4_I(inode)->i_data_sem);
>>>                 ext4_discard_preallocations(inode);
>>>                 up_write(&EXT4_I(inode)->i_data_sem);
>>>         }
>>>
>>> So we will need to have this per file. Maybe the condition being
>>> checked is wrong and can be corrected, but the correctness argument
>>> should be the same. We could also consider using "i_writecount" to
>>> control parallel writes in FAT.
>>> What do you think?
>>
>> AFAIK, preallocation != fallocate. ext*'s preallocation was there
>> before fallocate to optimize block allocation for user data blocks.
Yes, this is correct: preallocation != fallocate. We adopted only the
"release part" from that approach.
Sorry, would you mind if we adopt this approach :) ?
>>
>>>>>> I know. Question is, why do we need to initialize twice.
>>>>>>
>>>>>> 1) zeroed for uninitialized area, 2) then copy user data area. We
>>>>>> need only either, right? This seems to be doing both for all
>>>>>> fallocated area.
>>>>> We do not initialize twice. We use 'pos' as the attribute that
>>>>> defines the zeroing length in the pre-allocation case.
>>>>> Zeroing out occurs up to 'pos', while the actual write occurs after
>>>>> 'pos'.
>>>>> If the file size is 100KB and we pre-allocated up to 1MB, and we
>>>>> then try to write at 500KB, zeroing out occurs only for
>>>>> 100KB->500KB; after that it is a normal write. There is no
>>>>> duplication for the same space.
>>>>
>>>> Ah. Then does write_begin() really initialize from i_size up to the
>>>> page cache boundary for an append write? I wonder if this patch
>>>> works correctly for mmap.
>>> Since you already gave me review comments to check truncate and mmap,
>>> we checked all points for those cases.
>>
>> cluster size == 512b
>>
>> 1) create new file
>> 2) fallocate 100MB
>> 3) write(2) data for each 512b
>>
>> With this, write_begin() will be called for each 512b of data. When we
>> allocate a new page for this file, write_begin() writes data 0-512.
>> Then we have to initialize 512-4096 with zeros, because an mmap read
>> maps 0-4096 even if i_size == 512.
>>
>> Who is initializing the area 512-4096?
>
> From another view, I guess fat_zero_falloc_area() is for filling zero
> for 0-10000 in the following case?
>
> 1) create new file
> 2) lseek(10000)
> 3) write data by write(2)
>
> This job is for cont_write_begin(). If the example is correct, why
> doesn't cont_write_begin() work? I guess because get_block() doesn't
> set buffer_new() for that area.
>
> If the above is correct, the right implementation is to change
> get_block().
We will check your case. Thanks~
>
> Thanks.
> --
> OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>
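
For reference, a minimal sketch of the direction OGAWA suggests above: the
get_block() callback that FAT's write_begin path uses would mark
fallocated-but-unwritten blocks as new, so the generic
cont_write_begin()/block_write_begin() code zeroes the rest of the page.
The helper name and its i_size-based check below are assumptions for
illustration only, not code from the posted patch:

/*
 * Illustrative sketch only.  A block that was allocated by fallocate()
 * but lies at or beyond EOF has never been written, so report it as
 * "new"; the generic write_begin code will then zero the part of the
 * page that the caller does not overwrite.
 */
static void fat_mark_fallocated_block_new(struct inode *inode,
                                          sector_t iblock,
                                          struct buffer_head *bh_result)
{
        /* Index of the first block entirely beyond i_size. */
        sector_t eof_block = (i_size_read(inode) + inode->i_sb->s_blocksize - 1)
                                >> inode->i_sb->s_blocksize_bits;

        if (iblock >= eof_block)
                set_buffer_new(bh_result);
}

With buffer_new() set for such blocks, the zero-fill also covers the mmap
case discussed above: when only bytes 0-512 of a fresh page are written,
the 512-4096 range is zeroed by the generic code rather than left as stale
on-disk data.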