Re: [PATCH -V2] ext4: Drop mapped buffer_head check during page_mkwrite

"Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> · Mon, 31 Aug 2009 22:36:01 +0530

On Mon, Aug 31, 2009 at 08:50:25AM -0400, Theodore Tso wrote:
> On Mon, Aug 31, 2009 at 06:03:14PM +0530, Aneesh Kumar K.V wrote:
> > If the database is not being updated via a write(2), then even though
> > the blocks are already allocated, we won't find buffer_heads attached to the page.
> > 
> > ie, page_buffers(page) will be NULL
> > 
> > The page_mkwrite -> write_begin  path would be allocating the buffer_heads
> > and attaching them to the page. So even in the above case we will be
> > doing write_begin -> write_end. That is, it is similar to case (a) that I
> > described above.
> 
> What about the case where they are being updated via llseek(2) and
> write(2)?  I'll grant that isn't as common these days (dbm used to do
> it, but these days most people use berk_db, which does use mmap), but
> it's not a totally unknown thing to do.  Certainly any of the
> e2fsprogs tools operating on a filesystem-image-in-a-file (which isn't
> that uncommon if you are using KVM or some other virtualization
> situation) uses llseek(2) and write(2).  I'd have to check to see
> whether KVM/qemu is using mmap(2) or write(2).
> 
> If we think that, when updating already allocated blocks in place, it's
> much more common to use mmap(2) than lseek(2)/write(2), then I can see
> how avoiding taking a page_lock in ext4_page_mkwrite() might be the
> right choice.  On the other hand, if write(2) is more common, we'll be
> starting and stopping a transaction handle, and going through a *much*
> more complicated code path.
> 
> The other question I have then is that there are multiple
> write_begin/write_end functions that could be used, if we are going to
> be dropping this check in ext4_page_mkwrite() and depending on
> write_begin/write_end to do the right thing.  (ext4_write_begin,
> ext4_da_write_begin, ext4_ordered_write_end,
> ext4_journalled_write_end, ext4_writeback_write_end).  You did check
> all of the possible code path combinations, to make sure they will do
> the right thing?

Both ext4_write_begin and ext4_da_write_begin use block_write_begin,
which calls __block_prepare_write. That function looks at the mapped flag
of each buffer_head and calls get_block if the buffer is not mapped. The
delayed-alloc get_block does block reservation and returns a mapped
buffer_head; the non-delayed-alloc get_block does block allocation and
returns a mapped buffer_head. So in both cases I think we are OK.
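
To make the argument above concrete, here is a rough userspace model of the
check I am describing. It is only a sketch and not the kernel code: struct
toy_bh, toy_prepare_write and the two toy_get_block_* callbacks are
hypothetical stand-ins for buffer_head, __block_prepare_write and the
delalloc / non-delalloc get_block routines, and the block numbers and the
reservation counter are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct buffer_head; only what this sketch needs. */
struct toy_bh {
    bool mapped;   /* models the mapped flag of the buffer_head */
    bool delay;    /* models "reserved but not yet allocated" */
    long blocknr;  /* made-up disk block, meaningful only once allocated */
};

/* Hypothetical callback type playing the role of get_block. */
typedef int (*toy_get_block_t)(struct toy_bh *bh, long lblk);

static long toy_next_free = 1000; /* invented allocator cursor */
static long toy_reserved;         /* invented reservation counter */

/* Non-delayed-alloc model: allocate a block now and mark the buffer mapped. */
static int toy_get_block_alloc(struct toy_bh *bh, long lblk)
{
    (void)lblk;
    bh->blocknr = toy_next_free++;
    bh->mapped = true;
    return 0;
}

/* Delayed-alloc model: only reserve space, but still return a mapped buffer. */
static int toy_get_block_da(struct toy_bh *bh, long lblk)
{
    (void)lblk;
    toy_reserved++;
    bh->delay = true;
    bh->mapped = true;
    return 0;
}

/*
 * Models the per-buffer loop: buffers that are already mapped are left
 * alone, and get_block is called only for the unmapped ones.
 */
static int toy_prepare_write(struct toy_bh *bhs, size_t n, long first_lblk,
                             toy_get_block_t get_block)
{
    assert(get_block != NULL);
    for (size_t i = 0; i < n; i++) {
        if (bhs[i].mapped)
            continue;
        int err = get_block(&bhs[i], first_lblk + (long)i);
        if (err)
            return err;
    }
    return 0;
}
```

With either callback, every buffer is mapped by the time toy_prepare_write
returns, which is why I don't think page_mkwrite needs to do the mapped check
itself.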

-aneesh