Re: [PATCH v6] ovl: Improving syncfs efficiency

> 
> On Jan 23, 2018, at 2:45 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> 
> On Mon, Jan 22, 2018 at 6:00 PM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote:
>> 
>>> On Jan 22, 2018, at 8:58 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>> 
>>> On Mon, Jan 22, 2018 at 2:05 PM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote:
>>>>> 
>>>>> On Jan 22, 2018, at 7:27 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>>>>> 
>>>>> On Mon, Jan 22, 2018 at 11:47 AM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote:
>>>>>> Currently a syncfs(2) call on overlayfs simply calls sync_filesystem()
>>>>>> on the upper_sb to synchronize all dirty inodes in the upper filesystem,
>>>>>> regardless of which overlay instance owns them. In the container use
>>>>>> case, where multiple containers share the same underlying upper
>>>>>> filesystem, this has the following shortcomings.
>>>>>> 
>>>>>> (1) Performance
>>>>>> Synchronization can be heavy, because it also syncs inodes that are
>>>>>> irrelevant to the target overlayfs.
>>>>>> 
>>>>>> (2) Interference
>>>>>> Unplanned synchronization can hurt the IO performance of unrelated
>>>>>> container processes running on other overlayfs instances.
>>>>>> 
>>>>>> This patch iterates the upper inode list of the target overlayfs so that
>>>>>> only its own dirty inodes are synced and waited for. This reduces the
>>>>>> cost of synchronization and avoids seriously impacting the IO performance
>>>>>> of unrelated processes. In the special case of an overlayfs instance with
>>>>>> very few dirty inodes but a lot of clean upper inodes, the iteration may
>>>>>> waste enough time that synchronization becomes slower than before, but we
>>>>>> think the potential for performance improvement far outweighs the
>>>>>> potential for regression in most cases.
>>>>>> 
>>>>>> Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxx>
>>>>>> ---
>>>>>> Changes since v5:
>>>>>> - Move sync related functions to new file sync.c.
>>>>>> - Introduce OVL_EVICT_PENDING flag to avoid race conditions.
>>>>>> - If the ovl inode state is I_WILL_FREE or I_FREEING then syncfs
>>>>>> will wait until OVL_EVICT_PENDING is cleared.
>>>>>> - Move the inode sync operation from drop_inode into evict_inode.
>>>>>> - Call upper_sb->sync_fs with no-wait after syncing dirty upper
>>>>>> inodes.
>>>>>> - Hold s_inode_wblist_lock until the item is deleted from the list.
>>>>>> - Remove the write_inode function because it is no longer used.
>>>>>> 
> [...]
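
The flow described in the changelog above comes down to roughly the following
(a minimal sketch, not the actual patch; ovl_sync_inodes() and
ovl_wait_inodes() are hypothetical helpers standing in for the new code in
sync.c, while ovl_fs, upper_mnt and ->sync_fs() are existing kernel
structures):

/*
 * Minimal sketch of the syncfs flow described in the changelog; not the
 * actual patch.  ovl_sync_inodes()/ovl_wait_inodes() are hypothetical
 * helpers that would live in the new sync.c.
 */
static int ovl_sync_fs_sketch(struct super_block *sb, int wait)
{
	struct ovl_fs *ofs = sb->s_fs_info;
	struct super_block *upper_sb = ofs->upper_mnt->mnt_sb;
	int ret = 0;

	/* Only the wait==1 case is interesting here. */
	if (!wait)
		return 0;

	/* Write back only the dirty upper inodes owned by this overlay. */
	ovl_sync_inodes(sb);

	down_read(&upper_sb->s_umount);
	/* Flush upper fs metadata without waiting ... */
	if (upper_sb->s_op->sync_fs)
		ret = upper_sb->s_op->sync_fs(upper_sb, 0);
	/* ... then wait for completion of the writeback queued above. */
	ovl_wait_inodes(sb);
	up_read(&upper_sb->s_umount);

	return ret;
}
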
>>>>> 
>>>>> 
>>>>> 
>>>>>> +                               /* We need to do this as 'inode' can be freed by now */
>>>>> 
>>>>> Could oi_prev have been freed or removed from upper_inodes list?
>>>>> I don't see why not.
>>>>> You could try to continue iteration from iput_inode if it is not NULL,
>>>>> but what if it is NULL?
>>>> 
>>>> When iterating the list we either take a reference or wait until the item gets deleted,
>>>> so I think there is no chance for the previous inode to go away.
>>>> 
>>> 
>>> Maybe I am missing something.
>>> When iterating the upper_inodes list you get a reference on current (oi)
>>> and maybe have a reference on previous via iput_inode.
>>> previous may have a reference from wb_list, but that reference can be lost
>>> while we are not holding the wblist lock. No?
>> 
>> The reference on the previous inode can be lost if that inode is not our target,
>> but that can only happen after iput(iput_inode); before that point it is safe.
>> 
> 
> I was talking about the case where iput_inode is NULL.

If iput_inode is NULL, the reference on the previous inode is not put until the wait
finishes, so this case is even safer than when iput_inode has a value. Meanwhile, as
long as an inode still holds a reference, it won't get deleted from the list.
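
The pattern being discussed is essentially the same pin-current/put-previous
walk used by drop_pagecache_sb() in fs/drop_caches.c, roughly as below (an
illustration only: upper_inodes, list_lock and the upper_inodes_list link in
struct ovl_inode are assumptions about the patch, and the eviction wait is
simplified to a skip):

/*
 * Illustration only: upper_inodes, list_lock and the upper_inodes_list
 * link are assumptions about the patch, not real kernel names.  The
 * pin-current/put-previous scheme mirrors drop_pagecache_sb().
 */
static void ovl_sync_upper_inodes_sketch(struct list_head *upper_inodes,
					 spinlock_t *list_lock)
{
	struct ovl_inode *oi;
	struct inode *inode, *iput_inode = NULL;

	spin_lock(list_lock);
	list_for_each_entry(oi, upper_inodes, upper_inodes_list) {
		inode = &oi->vfs_inode;

		spin_lock(&inode->i_lock);
		if (inode->i_state & (I_FREEING | I_WILL_FREE)) {
			/*
			 * Eviction has started; the patch waits here for
			 * OVL_EVICT_PENDING to clear (simplified to a skip
			 * in this sketch).
			 */
			spin_unlock(&inode->i_lock);
			continue;
		}
		__iget(inode);			/* pin the current inode */
		spin_unlock(&inode->i_lock);
		spin_unlock(list_lock);

		/*
		 * Only now drop the previous inode.  Until this point it was
		 * pinned by iput_inode (or being waited on), so the list
		 * cursor could not have been freed under us.
		 */
		iput(iput_inode);

		filemap_fdatawrite(inode->i_mapping);	/* queue writeback */

		iput_inode = inode;
		spin_lock(list_lock);
	}
	spin_unlock(list_lock);
	iput(iput_inode);
}
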


> 
>> When deleting an item from the list we take the lock to protect against concurrent
>> operations, so as long as we hold the list lock it is safe to touch items in the list.
>> 
> 
> But we have let go of the list lock.

I'm not sure what you are worried about; could you be more specific?

Thanks,
Chengguang.

--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



