Re: fallocate support for bitmap-based files

On Fri, 29 Jun 2007 16:55:25 -0400
Theodore Tso <tytso@xxxxxxx> wrote:

> On Fri, Jun 29, 2007 at 01:01:20PM -0700, Andrew Morton wrote:
> > 
> > Guys, Mike and Sreenivasa at Google are looking into implementing
> > fallocate() on ext2.  Of course, any such implementation could and should
> > also be portable to ext3 and ext4 bitmapped files.
> 
> What's the eventual goal of this work?  Would it be for mainline use,
> or just something that would be used internally at Google?

Mainline, preferably.

>  I'm not
> particularly enthused about supporting two ways of doing fallocate();
> one for ext4 and one for bitmap-based files in ext2/3/4.  Is the
> benefit really worth it?

umm, it's worth it if you don't want to wear the overhead of journalling,
and/or if you don't want to wait on the, err, rather slow progress of ext4.

> What I would suggest, which would make this much easier, is to make
> this an incompatible extension (which, as you point out, is needed for
> security reasons anyway) and then steal the high bit from the block
> number field to indicate whether or not the block has been
> initialized.  That way you don't end up having to seek to a potentially
> distant part of the disk to check out the bitmap.  Also, you don't
> have to worry about how to recover if the "block initialized bitmap"
> inode gets smashed.  
> 
> The downside is that it reduces the maximum size of the filesystem
> supported by ext2 by a factor of two.  But, there are at least two
> patch series floating about that promise to allow filesystem block
> sizes larger than PAGE_SIZE, which would allow you to recover the
> maximum size supported by the filesystem.
> 
> Furthermore, I suspect (especially after listening to a very fascinating
> Usenix Invited Talk by Jeffrey Dean, a fellow from Google, two weeks
> ago) that for many of Google's workloads, using a filesystem blocksize
> of 16K or 32K might not be a bad thing in any case.
> 
> It would be a lot simpler....
> 

Hadn't thought of that.
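
To make the suggestion concrete, here is a rough sketch of what stealing
the high bit of a 32-bit block number could look like, plus the
size arithmetic behind the "factor of two" remark. All names here
(BLK_UNINIT_FLAG, blk_mark_uninit() and so on) are made up for
illustration and are not taken from the ext2 sources, and the size
calculation looks only at block-number width times block size, ignoring
every other limit a real filesystem has.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not from the ext2 sources. */
#define BLK_UNINIT_FLAG UINT32_C(0x80000000) /* the stolen high bit */
#define BLK_NUMBER_MASK UINT32_C(0x7fffffff) /* remaining 31 bits   */

/* Tag a physical block number as allocated but not yet initialized. */
static inline uint32_t blk_mark_uninit(uint32_t phys)
{
    assert((phys & BLK_UNINIT_FLAG) == 0); /* must fit in 31 bits */
    return phys | BLK_UNINIT_FLAG;
}

/* Clear the flag once the block has been written with real data. */
static inline uint32_t blk_mark_init(uint32_t entry)
{
    return entry & BLK_NUMBER_MASK;
}

/* Nonzero if the entry refers to an uninitialized block. */
static inline int blk_is_uninit(uint32_t entry)
{
    return (entry & BLK_UNINIT_FLAG) != 0;
}

/* Recover the physical block number regardless of the flag. */
static inline uint32_t blk_number(uint32_t entry)
{
    return entry & BLK_NUMBER_MASK;
}

/*
 * Bytes addressable with `nbits` of block number (nbits <= 32) at a
 * given block size.  Simplified: only models the block-number width.
 */
static inline uint64_t max_fs_bytes(unsigned nbits, uint32_t block_size)
{
    return (UINT64_C(1) << nbits) * block_size;
}
```

With this layout the state of a block is known as soon as its block
pointer has been read, so no separate bitmap lookup is needed.  Under
the simplified model, max_fs_bytes(31, 8192) equals
max_fs_bytes(32, 4096), which is the sense in which a larger block size
would win back the lost bit.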

Also, it's unclear to me why Google is going this way rather than using
(perhaps suitably-tweaked) ext2 reservations code.

Because the stock ext2 block allocator sucks big-time.