On Fri, 29 Jun 2007 16:55:25 -0400 Theodore Tso <tytso@xxxxxxx> wrote:

> On Fri, Jun 29, 2007 at 01:01:20PM -0700, Andrew Morton wrote:
> >
> > Guys, Mike and Sreenivasa at Google are looking into implementing
> > fallocate() on ext2.  Of course, any such implementation could and should
> > also be portable to ext3 and ext4 bitmapped files.
>
> What's the eventual goal of this work?  Would it be for mainline use,
> or just something that would be used internally at Google?

Mainline, preferably.

> I'm not
> particularly enthused about supporting two ways of doing fallocate();
> one for ext4 and one for bitmap-based files in ext2/3/4.  Is the
> benefit really worth it?

Umm, it's worth it if you don't want to wear the overhead of journalling,
and/or if you don't want to wait on the, err, rather slow progress of ext4.

> What I would suggest, which would make this much easier, is to make this
> an incompatible extension (which, as you point out, is needed for
> security reasons anyway) and then steal the high bit from the block
> number field to indicate whether or not the block has been
> initialized.  That way you don't end up having to seek to a potentially
> distant part of the disk to check out the bitmap.  Also, you don't
> have to worry about how to recover if the "block initialized bitmap"
> inode gets smashed.
>
> The downside is that it reduces the maximum size of the filesystem
> supported by ext2 by a factor of two.  But, there are at least two
> patch series floating about that promise to allow filesystem block
> sizes greater than PAGE_SIZE, which would allow you to recover the
> maximum size supported by the filesystem.
>
> Furthermore, I suspect (especially after listening to a very fascinating
> Usenix Invited Talk by Jeffery Dean, a fellow from Google, two weeks
> ago) that for many of Google's workloads, using a filesystem blocksize
> of 16K or 32K might not be a bad thing in any case.
>
> It would be a lot simpler....
>

Hadn't thought of that.
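
For concreteness, here is a rough sketch of what stealing the high bit of a
32-bit block number might look like, plus the size arithmetic behind the
"factor of two" remark.  All of the names below are made up for illustration
(nothing like this exists in ext2 today), and the size calculation counts only
the width of the block number, ignoring every other limit in the filesystem.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not part of ext2. */
#define BLK_UNINIT_FLAG 0x80000000u /* high bit: allocated, not yet initialized */
#define BLK_NUM_MASK    0x7fffffffu /* low 31 bits: the physical block number */

/* Strip the flag to get the block number actually used for I/O. */
static inline uint32_t blk_number(uint32_t raw)
{
	return raw & BLK_NUM_MASK;
}

/* Nonzero if the block is preallocated but holds no valid data yet. */
static inline int blk_is_uninit(uint32_t raw)
{
	return (raw & BLK_UNINIT_FLAG) != 0;
}

/* Mark a block pointer as preallocated (what fallocate() would store). */
static inline uint32_t blk_mark_uninit(uint32_t raw)
{
	return raw | BLK_UNINIT_FLAG;
}

/* Clear the flag once the block has been written for the first time. */
static inline uint32_t blk_mark_init(uint32_t raw)
{
	return raw & BLK_NUM_MASK;
}

/*
 * Addressable bytes given the number of usable block-number bits and
 * the block size.  Only valid for bits < 64.
 */
static inline uint64_t max_fs_bytes(unsigned int bits, uint32_t block_size)
{
	return ((uint64_t)1 << bits) * block_size;
}
```

With 4K blocks, going from 32 to 31 usable bits halves the addressable space
(2^32 * 4K = 16TB down to 2^31 * 4K = 8TB), and doubling the block size to 8K
gets it back, which is the trade-off described above.  The flag lives in the
block pointer itself, so checking it needs no extra seek.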

Also, it's unclear to me why Google is going this way rather than using
(perhaps suitably-tweaked) ext2 reservations code.

Because the stock ext2 block allocator sucks big-time.