Re: [PATCH v7] ovl: Improving syncfs efficiency

> 
> On Jan 27, 2018, at 12:08 AM, Jan Kara <jack@xxxxxxx> wrote:
> 
> Hi Chengguang!
> 
> On Fri 26-01-18 11:02:03, Chengguang Xu wrote:
>>> AFAIU your code, OVL_EVICT_PENDING should always be set when this function
>>> is called. Maybe just make this condition a WARN_ON?
>> 
>> OK, I’ll add.
>> 
>>> 
>>>> +		wqh = bit_waitqueue(&oi->flags, OVL_EVICT_PENDING);
>>>> +		prepare_to_wait(wqh, &wait.wq_entry, TASK_UNINTERRUPTIBLE);
>>>> +		sleep = ovl_test_flag(OVL_EVICT_PENDING, inode);
>> 
>> I think we also don’t have to check 'sleep' again here, because OVL_EVICT_PENDING
>> will not be cleared before ofs->upper_inodes_lock is unlocked.
> 
> I'm not sure what you are suggesting here - i.e., where do you think 'sleep'
> is being checked again?

Hi Jan,

In the new version I changed the code to keep an ovl_inode with OVL_EVICT_PENDING set
on &sync_tmp_list and simply continue the main loop. By doing this I think we can
simplify the check in ovl_wait_evict_pending(). This seems like the simplest way to me.
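
Roughly, the reworked part of ovl_sync_inodes() would look like the sketch
below (just to illustrate the idea, not necessarily the exact code I will
send); ovl_wait_evict_pending() still drops both ofs->upper_inodes_lock and
inode->i_lock before sleeping, so the list lock is re-taken afterwards:

	spin_lock(&ofs->upper_inodes_lock);
	list_splice_init(&ofs->upper_inodes, &sync_tmp_list);

	while (!list_empty(&sync_tmp_list)) {
		oi = list_first_entry(&sync_tmp_list, struct ovl_inode,
						upper_inodes_list);
		inode = &oi->vfs_inode;
		spin_lock(&inode->i_lock);

		if (inode->i_state & I_FREEING) {
			/*
			 * Keep the inode on sync_tmp_list instead of moving
			 * it back to ofs->upper_inodes: either eviction
			 * removes it from the list and we never see it
			 * again, or the next iteration sees
			 * OVL_EVICT_PENDING again and waits again.
			 */
			ovl_wait_evict_pending(inode);	/* drops both locks */
			spin_lock(&ofs->upper_inodes_lock);
			continue;
		}

		list_move_tail(&oi->upper_inodes_list, &ofs->upper_inodes);
		if (inode->i_state & I_NEW) {
			spin_unlock(&inode->i_lock);
			continue;
		}

		/* ... drop the locks, sync upper_inode, re-take the
		   list lock as in v7 ... */
	}
	spin_unlock(&ofs->upper_inodes_lock);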

Thanks,
Chengguang.


> 
>>>> +		spin_unlock(&inode->i_lock);
>>>> +		spin_unlock(&ofs->upper_inodes_lock);
>>>> +		if (sleep)
>>>> +			schedule();
>>>> +		finish_wait(wqh, &wait.wq_entry);
>>>> +	}
>>>> +}
>>>> +
>>>> +/**
>>>> + * ovl_sync_inodes
>>>> + * @sb: The overlayfs super block
>>>> + *
>>>> + * The upper_inodes list organizes the ovl inodes which have an upper inode;
>>>> + * we iterate the list looking for dirty upper inodes and sync them.
>>>> + *
>>>> + * When we start syncing an inode, we add it to a wait list explicitly;
>>>> + * to save memory we reuse inode->i_wb_list and sb->s_inodes_wb,
>>>> + * so those fields have slightly different meanings in overlayfs.
>>>> + */
>>>> +static void ovl_sync_inodes(struct super_block *sb)
>>>> +{
>>>> +	struct ovl_fs *ofs = sb->s_fs_info;
>>>> +	struct ovl_inode *oi;
>>>> +	struct inode *inode;
>>>> +	struct inode *upper_inode;
>>>> +	struct blk_plug plug;
>>>> +	LIST_HEAD(sync_tmp_list);
>>>> +
>>>> +	struct writeback_control wbc_for_sync = {
>>>> +		.sync_mode		= WB_SYNC_ALL,
>>>> +		.for_sync		= 1,
>>>> +		.range_start		= 0,
>>>> +		.range_end		= LLONG_MAX,
>>>> +		.nr_to_write		= LONG_MAX,
>>>> +	};
>>>> +
>>>> +	blk_start_plug(&plug);
>>>> +	spin_lock(&ofs->upper_inodes_lock);
>>>> +	list_splice_init(&ofs->upper_inodes, &sync_tmp_list);
>>>> +
>>>> +	while (!list_empty(&sync_tmp_list)) {
>>>> +		oi = list_first_entry(&sync_tmp_list, struct ovl_inode,
>>>> +						upper_inodes_list);
>>>> +		inode = &oi->vfs_inode;
>>>> +		spin_lock(&inode->i_lock);
>>>> +
>>>> +		list_move_tail(&oi->upper_inodes_list, &ofs->upper_inodes);
>>>> +		if (inode->i_state & I_NEW) {
>>>> +			spin_unlock(&inode->i_lock);
>>>> +			continue;
>>>> +		}
>>>> +
>>>> +		/*
>>>> +		 * If the ovl inode state is I_FREEING,
>>>> +		 * leave the sync operation to ovl_evict_inode(),
>>>> +		 * so wait here until the OVL_EVICT_PENDING flag is cleared.
>>>> +		 */
>>>> +		if (inode->i_state & I_FREEING) {
>>>> +			ovl_wait_evict_pending(inode);
>>> 
>>> This is still somewhat racy. The problem is that a wakeup in
>>> ovl_wait_evict_pending() can come from all sorts of sources. You are not
>>> guaranteed OVL_EVICT_PENDING is cleared when you are woken up, and you cannot
>>> easily recheck the flag as the inode *may* really have been freed by this time.
>> 
>> I think DEFINE_WAIT_BIT wakes the sleeping thread only for the right reason, and
>> there is no other place that sets OVL_EVICT_PENDING again. IIUC, we don’t have
>> to check OVL_EVICT_PENDING in a loop. Am I missing something?
> 
> You are correct that wake_up_bit() will only wake up the process once it has
> cleared the bit. But there can be other wakeup sources, like signal
> delivery, so you cannot assume you will get woken only after the inode is
> gone...
> 
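
I see. Just to spell out (for my own understanding) why a re-test loop does not
help here: the conventional pattern for tolerating unrelated wakeups would be
something like the purely illustrative sketch below, but the re-test has to
dereference the ovl_inode after the wakeup, and by then ovl_evict_inode() may
already have freed it:

	wait_queue_head_t *wqh = bit_waitqueue(&oi->flags, OVL_EVICT_PENDING);
	DEFINE_WAIT_BIT(wait, &oi->flags, OVL_EVICT_PENDING);

	while (ovl_test_flag(OVL_EVICT_PENDING, inode)) {
		prepare_to_wait(wqh, &wait.wq_entry, TASK_UNINTERRUPTIBLE);
		if (ovl_test_flag(OVL_EVICT_PENDING, inode))
			schedule();
		finish_wait(wqh, &wait.wq_entry);
		/* 'inode' may already be gone here, so the while() re-test
		 * above would be a use-after-free */
	}

So instead of looping on the flag I will keep the inode on &sync_tmp_list, as
described at the top of this mail.
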
>>> What I'd do to fix this race is not to move an inode with OVL_EVICT_PENDING to
>>> the tail of ofs->upper_inodes but just keep it in &sync_tmp_list. That way
>>> we are sure that either the inode gets freed (and thus removed from
>>> &sync_tmp_list) and we'll never see it again, or we'll see it again in
>>> &sync_tmp_list, see OVL_EVICT_PENDING again, and wait for it again, which is
>>> what we need.
>> 
>> If my analysis above is not correct, I’ll modify it as you suggest here.
> 
> Please do.
> 
> 								Honza
> -- 
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR
