Re: ext4_page_mkwrite and delalloc

On Thu, 2008-06-12 at 23:44 +0530, Aneesh Kumar K.V wrote:
> Hi,
> 
> With delalloc we should not do writepage in ext4_page_mkwrite. The idea
> with delalloc is to delay the block allocation and make sure we allocate
> chunks of blocks together at writepages. So i guess we should update
> ext4_page_mkwrite to use write_begin and write_end instead of writepage.

I agree. With delayed allocation, page_mkwrite is much simpler: it only
needs to do block reservation to prevent ENOSPC.
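
To make the reservation idea concrete, here is a toy user-space model. It
is not kernel code, and the struct and function names (toy_fs,
toy_reserve_block, toy_allocate_reserved) are made up for illustration.
It only shows the accounting: the fault path reserves a block or fails
with -ENOSPC right away, and the actual allocation happens later in one
batch, roughly where writepages would run.

```c
#include <assert.h>
#include <errno.h>

/* Toy model only: these names are hypothetical, not ext4 symbols. */
struct toy_fs {
	long free_blocks;      /* blocks not yet allocated on disk */
	long reserved_blocks;  /* blocks promised to dirty pages, not yet allocated */
};

/*
 * Fault-time step (page_mkwrite analogue): reserve one block, or report
 * -ENOSPC now instead of failing later at writeback time.
 */
int toy_reserve_block(struct toy_fs *fs)
{
	if (fs->free_blocks - fs->reserved_blocks < 1)
		return -ENOSPC;
	fs->reserved_blocks++;
	return 0;
}

/*
 * Writeback-time step (writepages analogue): convert all outstanding
 * reservations into allocations in one batch. Returns the number of
 * blocks allocated.
 */
long toy_allocate_reserved(struct toy_fs *fs)
{
	long n = fs->reserved_blocks;

	/* Reservations never exceed free space, so this cannot fail. */
	assert(n <= fs->free_blocks);
	fs->free_blocks -= n;
	fs->reserved_blocks = 0;
	return n;
}
```

With two free blocks, two reservations succeed and a third returns
-ENOSPC at fault time; the later batch allocation then always succeeds
because the space was already accounted for.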

> Taking i_alloc_sem should protect against parallel truncate and the page
> lock should protect against parallel write_begin/write_end.
> 
> How about the patch below ?
> 

Do we plan to support page_mkwrite for non-delalloc? The following patch
seems to suggest that we only do page_mkwrite with delalloc.

> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index cac132b..7f162cc 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3543,18 +3543,6 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
>  	return err;
>  }
> 
> -static int ext4_bh_prepare_fill(handle_t *handle, struct buffer_head *bh)
> -{
> -	if (!buffer_mapped(bh)) {
> -		/*
> -		 * Mark buffer as dirty so that
> -		 * block_write_full_page() writes it
> -		 */
> -		set_buffer_dirty(bh);
> -	}
> -	return 0;
> -}
> -
>  static int ext4_bh_unmapped(handle_t *handle, struct buffer_head *bh)
>  {
>  	return !buffer_mapped(bh);
> @@ -3596,24 +3584,22 @@ int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page)
>  		if (!walk_page_buffers(NULL, page_buffers(page), 0, len, NULL,
>  				       ext4_bh_unmapped))
>  			goto out_unlock;
> -		/*
> -		 * Now mark all the  buffer head dirty so
> -		 * that writepage can write it
> -		 */
> -		walk_page_buffers(NULL, page_buffers(page), 0, len,
> -					NULL, ext4_bh_prepare_fill);
>  	}
>  	/*
> -	 * OK, we need to fill the hole... Lock the page and do writepage.
> -	 * We can't do write_begin and write_end here because we don't
> -	 * have inode_mutex and that allow parallel write_begin, write_end call.
> +	 * OK, we need to fill the hole... Lock the page and do write_begin
> +	 * write_end. We are not holding inode.i__mutex here. That allow
> +	 * parallel write_begin, write_end call.
>  	 * (lock_page prevent this from happening on the same page though)
>  	 */
> -	lock_page(page);
> -	wbc.range_start = page_offset(page);
> -	wbc.range_end = page_offset(page) + len;
> -	ret = mapping->a_ops->writepage(page, &wbc);
> -	/* writepage unlocks the page */
> +	ret = mapping->a_ops->write_begin(file, mapping, page_offset(page),
> +			len, AOP_FLAG_UNINTERRUPTIBLE, &page, NULL);

What is this AOP_FLAG_UNINTERRUPTIBLE flag? Also, shouldn't we test
whether delalloc is enabled?

> +	if (ret < 0)
> +		goto out_unlock;
> +	ret = mapping->a_ops->write_end(file, mapping, page_offset(page),
> +			len, len, page, NULL);

I am still puzzled why we need to mark the page dirty in write_end here.
I thought doing only the block reservation in write_begin would be enough,
since we haven't written anything yet...

Mingming
> +	if (ret < 0)
> +		goto out_unlock;
> +	ret = 0;
>  out_unlock:
>  	up_read(&inode->i_alloc_sem);
>  	return ret;
> 
> 
> 
> If we agree i will send an updated ext4_page_mkwrite.patch and other
> related patches that needed to be updated so that the patch queue apply
> cleanly. 
> 
> -aneesh
