Re: [RFC] basic delayed allocation in VFS

[please don't top post!]

On Thu, Jul 26, 2007 at 05:33:08PM +0400, Alex Tomas wrote:
> Jeff Garzik wrote:
> >The XFS one is proven and the work was already completed.
> >
> >What were the specific technical issues that made it unsuitable for ext4?
> >
> >I would rather not reinvent the wheel, particularly if the reinvention 
> >is less capable than the existing work.
>
> It duplicates fs/mpage.c in bio building and introduces new generic API
> (iomap, map_blocks_t, etc).

Using a new API for new functionality is a bad thing?

> In contrast, my trivial implementation re-use
> existing code in fs/mpage.c, doesn't introduce new API and I tend to think
> provides quite the same functionality. I can be wrong, of course ...

No, it doesn't provide the same functionality.

Firstly, XFS attaches a different I/O completion to delalloc writes
to allow us to update the file size when the write is beyond the
current on-disk EOF. This code cannot do that, as all it does is
allocate blocks and present "normal looking" buffers to the generic
code path.
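
To make the first point concrete, here is a rough userspace sketch of
the idea, not kernel code. Every name in it (toy_inode, toy_io,
toy_end_io_delalloc and so on) is made up for illustration, and the
clamp to the in-core size is an assumption about how such a handler
would behave. The only point is that the size update lives in a
completion hook chosen per I/O, which a generic completion never runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inode model: in-core size vs. size recorded on disk. */
struct toy_inode {
    long long isize;       /* in-core file size */
    long long disk_isize;  /* on-disk file size (on-disk EOF) */
};

struct toy_io;
typedef void (*toy_end_io_t)(struct toy_io *io);

/* Hypothetical model of one writeback I/O with its own completion hook. */
struct toy_io {
    struct toy_inode *inode;
    long long offset;      /* file offset of the write */
    long long len;         /* length of the write in bytes */
    toy_end_io_t end_io;   /* completion chosen when the I/O was built */
};

/* Generic completion: knows nothing about the file size. */
void toy_end_io_generic(struct toy_io *io)
{
    (void)io;
}

/* Delalloc completion: move on-disk EOF forward once the data is on disk. */
void toy_end_io_delalloc(struct toy_io *io)
{
    long long end = io->offset + io->len;

    if (end > io->inode->isize)
        end = io->inode->isize;        /* assumed: never pass in-core size */
    if (end > io->inode->disk_isize)
        io->inode->disk_isize = end;
}

/* Simulate the I/O finishing: run whichever completion was attached. */
void toy_io_complete(struct toy_io *io)
{
    assert(io != NULL && io->end_io != NULL);
    io->end_io(io);
}
```

With an inode whose in-core size is 8192 and on-disk size is 4096, a
4096-byte write at offset 4096 completed through toy_end_io_generic
leaves the on-disk size at 4096. The same write completed through
toy_end_io_delalloc moves it to 8192.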

Secondly, apart from delalloc, XFS cannot use the generic code paths
for writeback because unwritten extent conversion also requires
custom I/O completion handlers. Given that __mpage_writepage() only
calls ->writepage when it is confused, XFS simply cannot use this
API.
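
The unwritten extent case can be sketched the same way. Again this is
a userspace toy with invented names (toy_extent, toy_end_io_unwritten),
and it converts the whole extent at once, which is a simplification I
am assuming for brevity. It only shows why the conversion has to hang
off I/O completion: until it runs, reads must keep returning zeros
even though the data has already reached the disk.

```c
#include <assert.h>
#include <string.h>

enum toy_ext_state { TOY_EXT_UNWRITTEN, TOY_EXT_WRITTEN };

/* Hypothetical single-extent file: preallocated blocks plus a state flag. */
struct toy_extent {
    enum toy_ext_state state;
    char blocks[16];   /* pretend disk contents; may hold stale data */
};

/* Reads of an unwritten extent return zeros, not the disk contents. */
void toy_read(const struct toy_extent *ext, char *buf, size_t len)
{
    assert(len <= sizeof(ext->blocks));
    if (ext->state == TOY_EXT_UNWRITTEN)
        memset(buf, 0, len);
    else
        memcpy(buf, ext->blocks, len);
}

/* Step 1: the data I/O itself lands on disk. */
void toy_write_data(struct toy_extent *ext, const char *data, size_t len)
{
    assert(len <= sizeof(ext->blocks));
    memcpy(ext->blocks, data, len);
}

/* Step 2: the custom completion flips the extent to written. */
void toy_end_io_unwritten(struct toy_extent *ext)
{
    ext->state = TOY_EXT_WRITTEN;
}
```

If only step 1 happens, as it would with a completion handler that
knows nothing about extent state, the new data is on disk but a read
still sees zeros.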

Also, looking at the way mpage_da_map_blocks() is done: if we have
a 128MB delalloc extent, ext4 will allocate it in one go, right?
What happens if we then crash after writing only a few megabytes of
that extent? Stale data exposure? XFS can allocate multiple
gigabytes in a single get_blocks call, so even if ext4 can't do
this, it's a problem for XFS.
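
For a feel of the numbers, here is a small arithmetic sketch. The
helper name toy_stale_bytes and the model behind it are my own
assumptions: an extent is allocated in one go, only a prefix of it has
been written when the crash happens, and anything inside the extent
that lies below the on-disk EOF is readable afterwards.

```c
#include <assert.h>

/*
 * Bytes of an allocated extent that are readable after a crash but
 * were never written, i.e. still hold whatever was on disk before.
 */
long long toy_stale_bytes(long long ext_start, long long ext_len,
                          long long bytes_written, long long disk_eof)
{
    assert(ext_len >= 0 && bytes_written >= 0 && bytes_written <= ext_len);

    long long ext_end = ext_start + ext_len;
    long long visible_end = disk_eof < ext_end ? disk_eof : ext_end;
    long long written_end = ext_start + bytes_written;

    return visible_end > written_end ? visible_end - written_end : 0;
}
```

Taking 4MB as a stand-in for "a few megabytes": with a 128MB extent at
offset 0 and the on-disk EOF already at 128MB, the function returns
124MB of potentially stale data. If the on-disk EOF only advances as
writes complete and so sits at 4MB, it returns 0.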

So without the ability to attach specific I/O completions to bios
or support for unwritten extents directly in __mpage_writepage,
there is no way XFS can use this "generic" delayed allocation code.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group