Re: [PATCH v4] ovl: Improving syncfs efficiency

On Mon, Jan 15, 2018 at 3:41 PM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote:
>>
>> On Jan 15, 2018, at 6:19 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>>
>> On Mon, Jan 15, 2018 at 11:10 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>> On Mon, Jan 15, 2018 at 11:29 AM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>>>> On Mon, Jan 15, 2018 at 8:33 AM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote:
>>>>> Currently a syncfs(2) call on overlayfs simply calls sync_filesystem()
>>>>> on the upper_sb to synchronize all dirty inodes in the upper filesystem,
>>>>> regardless of which overlay owns the inode. In container use cases,
>>>>> where multiple containers share the same underlying upper filesystem,
>>>>> this has the following shortcomings.
>>>>>
>>>>> (1) Performance
>>>>> Synchronization can be heavy because it syncs inodes that are
>>>>> unnecessary for the target overlayfs.
>>>>>
>>>>> (2) Interference
>>>>> Unplanned synchronization can hurt the IO performance of unrelated
>>>>> container processes on other overlayfs instances.
>>>>>
>>>>> This patch iterates the upper inode list in overlayfs to sync only the
>>>>> target dirty inodes and waits for completion. This reduces the cost of
>>>>> synchronization and avoids seriously impacting the IO performance of
>>>>> unrelated processes. In the special case of an overlayfs with very few
>>>>> dirty inodes but many clean upper inodes, the iteration may waste time
>>>>> and make synchronization slower than before, but we think the potential
>>>>> performance improvement far outweighs the potential regression in most
>>>>> cases.
>>>>>
>>>>> Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxx>
>>>>> ---
>>>>> Changes since v3:
>>>>> - Introduce upper_inode list to organize ovl inodes which have an upper inode.
>>>>
>>>> Where is the inode removed from the list?  Should be done in
>>>> ovl_destroy_inode().
>>>>
>>>> Problem is: we can drop the overlay inode even when the underlying
>>>> inode is dirty, so we need to sync the upper inode before letting it
>>>> go in ovl_destroy_inode().
>>>>
>>>
>>> Isn't that a bit counter to the lazy inode writeback approach?
>>
>> Not really.  The dcache still keeps inodes and cache in memory, until
>> there's memory pressure to get rid of the dentry itself.
>>
>> So this is about syncing the upper inode when the overlay dentry is
>> flushed out of the dcache, which should not interfere with lazy
>> writeback in general.
>>
>>> Won't this cause a change of behavior, e.g. by forcing data sync before
>>> returning from close of file after a large write?
>>
>> No, file keeps a ref on dentry, which keeps a ref on inode +
>> underlying dentries.
>>
>
> Hi Miklos, Amir
>
> Thanks for your comments for the patch.
>
> If the speed of ovl inode destruction is not a serious problem, I would like
> to implement it as below. What do you think?
>
> In the process of ovl inode destruction
> (I will introduce an evict operation and do the operations below there):
>
> 1. Start syncing the inode.
> 2. Add the current inode to the sb->s_inodes_wb list.
> 3. Take sb->s_sync_lock and wait until all inodes on the wait list finish syncing.
> #Of course, we could wait for only the specific inode, but would that really help much?
>
> In the worst case, inode destruction may take time for the first inode, but
> I think that is not common for all inodes. The purpose of adding the inode to
> the sb->s_inodes_wb list is to keep the sync result consistent, because I'm
> afraid inode destruction could let a dirty inode escape the sync waiting list.
>

I don't see a flaw in this plan, but here is another alternative plan
to consider:

Implement drop_inode() operation (not evict_inode) as follows:

if inode->i_state & I_WILL_FREE:
    return 1 /* ok to evict */
else if upper is dirty:
    increment i_refcount
    inode->i_state |= I_WILL_FREE
    add to sb->s_inodes_wb list
    return 0 /* don't evict */
else:
    return 1 /* ok to evict */

If I am not wrong, this will let the inode live a little longer with a
single refcount on the sb->s_inodes_wb list, until a later umount or
syncfs syncs the sb->s_inodes_wb list and drops the last refcount.

This way, under memory pressure, when many overlay inodes are evicted, we
do not sync and wait on them one at a time; instead, like most filesystems,
we sync and then wait on all of them in the sync_filesystem() call.

Hope I am not sending you on a dangerous adventure; maybe consider the
simpler and slower implementation before trying this optimization.

Thanks,
Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
