Hi Ted,

Thanks for the response. I agree with you, and I would prefer to send something more serious to the list than those previous patches. I like your idea with counters. By the way, I assume a CRC is preferable to a simple checksum for the block group descriptors.

Pavel
p.mironchik@xxxxxxxxxxx
tibor@xxxxxxxxxxxx

On 9/16/06, Theodore Tso <tytso@xxxxxxx> wrote:
On Tue, Sep 12, 2006 at 02:07:34PM +0300, Pavel Mironchik wrote:
>
> Ext2/3 erases the inode tables when creating a new file system.
> This is a very, very long operation when the target volume is larger
> than 2TB. Other file systems are not affected by such a huge delay at
> creation time. My concern was to improve the design of ext3 to decrease
> the time it takes to create large ext3 volumes on storage servers.
> In general, to solve the problem, we should defer the job of cleaning
> inodes to the kernel. In e2fsprogs there is the LAZY_BG option, but it
> only avoids erasing the inodes.

Hi Pavel,

Apologies that no one responded right away; I think a lot of people have been incredibly busy. I've been doing a huge amount of travel myself, and so my e-mail latency has been larger than normal.

The problem of long mke2fs run times is one that we've considered, and we do want to do something about it, but it's not been as high a priority as some of the other problems on our hit list. Certainly, given that inode space is very precious, I'm not convinced that breaking backwards compatibility and burning an extra 16 bytes per inode is worth the net gain --- although there are other solutions that don't have that particular cost. Yes, they take more lines of code to support, but given the hopefully large number of people that will be using ext4, I'd much rather spend an extra amount of development time getting it right than do something fast and dirty which then everyone pays for, over and over, again and again and again, across millions and millions of machines!

> I see several solutions for that problem:
> 1) Add special bitmaps into the fs header (inode group descriptors?).
> By looking at those bitmaps, the kernel could determine whether an inode
> has not been cleaned, and that inode would then be properly initialized.
Actually, you don't need a bitmap; a much simpler solution is to have an integer field in the block group descriptors which indicates the number of inodes that have been initialized in that block group. The problem, though, is what happens if the block group descriptors (or the bitmaps) get corrupted. So what we also want to do is to add support for checksums in the individual inodes and in the block group descriptors themselves, as a double-check. These are useful features in and of themselves, and there are some sample implementations of them (for example, in the Iron ext2 paper).

So my thinking is that we should get that work into ext4, and then it's not hard to add the support for fields in the block group descriptors that would allow for fast mke2fs support.

Regards,

- Ted
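
The counter-plus-checksum idea above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the struct layout, the field names (`inodes_initialized`, `checksum`), and all helper functions are hypothetical and do not correspond to the actual ext3/ext4 on-disk format or to any kernel or e2fsprogs API. The checksum is a plain bitwise CRC-16 (reflected polynomial 0xA001, initial value 0); the thread does not say which CRC ext4 would use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor, for illustration only; NOT the real on-disk layout. */
struct bg_desc_sketch {
    uint32_t inode_table_block;   /* first block of this group's inode table */
    uint32_t inodes_initialized;  /* count of leading inode slots already zeroed */
    uint16_t checksum;            /* CRC-16 over the two fields above */
};

/* Bitwise CRC-16 with reflected polynomial 0xA001 and initial value 0. */
static uint16_t crc16(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Checksum the covered fields in a fixed byte order, so that struct
 * padding and host endianness do not affect the result. */
static uint16_t bg_desc_csum(const struct bg_desc_sketch *d)
{
    uint8_t raw[8];
    put_le32(raw, d->inode_table_block);
    put_le32(raw + 4, d->inodes_initialized);
    return crc16(raw, sizeof raw);
}

/* A descriptor is trusted only if its stored checksum matches. */
static bool bg_desc_valid(const struct bg_desc_sketch *d)
{
    return d->checksum == bg_desc_csum(d);
}

/* True if inode slot idx (0-based within the group) has never been
 * zeroed and must be initialized before it is used. */
static bool inode_needs_init(const struct bg_desc_sketch *d, uint32_t idx)
{
    return idx >= d->inodes_initialized;
}

/* Record that slots 0..idx are initialized and refresh the checksum.
 * A real implementation would first zero every slot between the old
 * count and idx on disk; that I/O is omitted here. */
static void mark_initialized(struct bg_desc_sketch *d, uint32_t idx)
{
    if (idx >= d->inodes_initialized) {
        d->inodes_initialized = idx + 1;
        d->checksum = bg_desc_csum(d);
    }
}
```

With a layout along these lines, mke2fs would only need to write each descriptor with `inodes_initialized` set to 0 and a valid checksum, leaving the inode table itself untouched. If the checksum does not verify, the counter cannot be trusted, which is why the checksum work has to land before the counter is useful.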