On Jun 09, 2006  22:41 -0700, Valerie Henson wrote:
> On Fri, Jun 09, 2006 at 11:25:02PM -0600, Andreas Dilger wrote:
> > This needs some extra data in the directory entry, which I've already
> > been thinking about for ext3, so if you are looking at implementing
> > this for ext3 I'd be happy to share some ideas.
>
> Actually, it seems vaguely possible this could be implemented as a
> layer on top of any normal file system - just use files to store
> continuation inodes and the like.  Then you could use the file system
> that best suits your workload underneath.

That is basically Lustre.  One filesystem (the metadata filesystem, MDS)
holds just the pathnames and some EA (extended attribute) data that points
to other files (these are essentially "file continuation inodes").  The
data filesystems (object storage filesystems, OST) have the file data
RAID0 striped over multiple OST "objects".  The objects are just regular
files stored in ext3 filesystems.

In clustered metadata Lustre (CMD) there are also continuation inodes for
files in a single directory, but currently a 2TB MDS filesystem is plenty
big for holding just filenames and inodes.

The same problems exist with Lustre that you have to face with the
continuation inode scheme - files that grow too large for a single chunk,
cross-chunk namespace links, etc.

Of course we'd be thrilled if there were a desire to implement Lustre at a
completely local-filesystem level (removing a lot of the networking and
required recovery mechanism), though it would also be desirable to have
the ability to move a filesystem from a local box to a distributed
filesystem (a la X11) without any changes.

> (Suparna has a paper in the next OLS talking about something related
> but not identical, check it out.)

Interesting, I'll have to take a look.

> forking might get rid of some constraints - e.g., an XFS fork could
> get rid of a lot of crufty compat code.
It continually amazes me that XFS even made it into the kernel as it
currently stands, because of the normally vehement objections to any kind
of abstraction of code.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
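
To make the RAID0 striping over OST "objects" mentioned above a bit more
concrete, here is a minimal sketch of how a byte offset in a file could be
mapped to an object and an offset within that object.  This is not Lustre
code: the names (StripeLocation, locate, stripe_size, stripe_count) and the
1 MiB stripe size in the usage note are assumptions made purely for
illustration, and plain round-robin striping is assumed.

```python
from typing import NamedTuple


class StripeLocation(NamedTuple):
    """Hypothetical result type: where a file byte lives in a striped layout."""
    object_index: int   # which of the file's objects (0-based) holds the byte
    object_offset: int  # byte offset inside that object


def locate(file_offset: int, stripe_size: int, stripe_count: int) -> StripeLocation:
    """Map a file offset to (object, offset) under plain RAID0 striping.

    Assumes stripes are dealt out round-robin: stripe 0 goes to object 0,
    stripe 1 to object 1, ..., wrapping around after stripe_count objects.
    """
    if file_offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe size and count must be > 0")

    stripe_number = file_offset // stripe_size   # global stripe index in the file
    object_index = stripe_number % stripe_count  # round-robin object selection
    row = stripe_number // stripe_count          # full rounds completed so far
    # Each completed round placed one stripe in this object, so skip those,
    # then add the position inside the current stripe.
    object_offset = row * stripe_size + file_offset % stripe_size
    return StripeLocation(object_index, object_offset)
```

For example, with an assumed 1 MiB stripe size over 4 objects,
locate(5 * 1048576 + 7, 1048576, 4) lands in object 1 at offset 1048576 + 7,
since stripe 5 is the second stripe placed in that object.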