Hi,

On Fri, 2007-10-05 at 16:49 +0100, Gordan Bobic wrote:
> Hi,
>
> I stumbled upon an old document from back in 2000 (before RedHat acquired
> Sistina), and they were talking about a number of features for the "next
> version", including shadowing/copy-on-write.
>
> The two features I am particularly interested in are:
>
> 1) Compression
> I consider this to be important both for performance reasons and the fact
> that no matter how cheap, disks will always be more expensive.
> Performance-wise, at some point I/O becomes the bottleneck. Not
> necessarily the disk I/O but network I/O of the SAN, especially when all
> the nodes in the cluster are sharing the same SAN bandwidth. At that
> point, reducing the data volume through compression becomes a performance
> win. This point isn't all that difficult to reach even on a small cluster
> on gigabit ethernet.
>

There are really two issues here rather than one:

1. Compression of data
This has "allocate on flush" as a prerequisite, since we would really need
"compress on flush" in order to make this a viable option. We'd also need
hints as to what kind of data we are looking at in order to make it
worthwhile. We'd have to look at crypto too: you can't usefully compress
encrypted data, so the compression must come first if it's required.

2. Compression of metadata
This might well be worth looking into. There is a considerable amount of
redundancy in typical fs metadata, and we ought to be able to reduce the
number of blocks we have to read/write in order to complete an operation
in this way. Using extents, for example, could be considered a form of
metadata compression. The main problem is that our "cache line", if you
like, in GFS(2) is one disk block, so sharing between nodes is a problem
(hence the one-inode-per-block rule we have at the moment). We'd need to
address the metadata migration issue first.

Neither of the above is likely to happen soon, though, as they both require
on-disk format changes.
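
To illustrate the remark that extents can be seen as a form of metadata
compression, here is a minimal sketch. It is not GFS(2)'s on-disk format;
the function names and the (start, length) tuple layout are assumptions
made purely for illustration. A per-block pointer list is collapsed into
runs of contiguous blocks, so a mostly contiguous file needs far fewer
metadata entries.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start_block, length); hypothetical layout


def blocks_to_extents(blocks: List[int]) -> List[Extent]:
    """Collapse an ordered list of block numbers into (start, length) runs."""
    extents: List[Extent] = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # Block directly follows the last run, so extend that run.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Block is not contiguous with the last run, so start a new one.
            extents.append((block, 1))
    return extents


def extents_to_blocks(extents: List[Extent]) -> List[int]:
    """Expand (start, length) runs back into the per-block list."""
    return [start + i for start, length in extents for i in range(length)]


if __name__ == "__main__":
    pointers = [100, 101, 102, 103, 500, 501, 900]
    runs = blocks_to_extents(pointers)
    print(len(pointers), "block pointers ->", len(runs), "extents:", runs)
```

The saving depends entirely on how contiguous the allocation is: a fully
fragmented file gets one extent per block and gains nothing.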

> 2) Shadowing/Copy-On-Write File Versioning
> Backups have 2 purposes - retrieving a file that was lost or corrupted
> through user error, and files lost or corrupted through disk failure. High
> levels of RAID alleviate the need for backup for the latter reason, but
> they do nothing to alleviate user-error caused damage. At the same time
> SANs can get big - I don't see hundreds of TB to be an inconceivable size.
> At this size, backups become an issue. Thus, a feature to provide file
> versioning is important.
>
> In turn, 2) increases the volume of data, which increases the need for 1).
>
> Are either of these two features planned for GFS in the near future?
>
> Gordan
>

This also requires on-disk format changes, but I agree that it would be a
nice thing to do. It's very much on my mind, though, what a suitable
scheme would be. I suspect we also have an ever-increasing patent minefield
to walk through here.

Potentially it would be possible to address both of the above suggestions
(minus the metadata compression) by using a stacking filesystem. That would
potentially be more flexible, since it would introduce the features on all
filesystems, not just GFS(2).

Steve.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster