On Fri, May 11, 2007 at 03:46:41AM -0600, Valerie Henson wrote:
> On Wed, May 09, 2007 at 02:51:41PM -0500, Matt Mackall wrote:
> >
> > We will, unfortunately, need to be able to check an entire directory
> > at once. There's no other efficient way to assure that there are no
> > duplicate names in a directory, for instance.
>
> I don't see that being a major problem for the vast majority of
> workloads.
>
> > In summary, checking a tile requires trivial checks on all the inodes
> > and directories that point into a tile. Inodes, directories, and data
> > that are inside a tile get checked more thoroughly but still don't
> > need to do much pointer chasing.
>
> Okay, I'm totally convinced - checking a tile-at-a-time works! I'm
> going to steal as many of your ideas as possible and write ChileFS. :)
> Per-block inode rmap in particular has so many advantages that I'm
> ranking it up with checksums as a must-have feature.
>
> Now for the hard part: repair. If you find an indirect block or
> extent with a bad checksum, how much of the file system are you going
> to have to read to fix the dangling blocks? I can see a speed-up by
> reading just the rmaps and looking for the associated inode number.
> What about an inode that has been corrupted? You could at least get
> the inode number out of the rmap, but your pointers to your first
> level indirect blocks are gone. A directory block? No way to get
> useful information out of the rmap there. Ad nauseum. This is where
> I really like having the encapsulation of chunkfs, despite all the
> nasty continuation inode bits.

There are a few interesting possibilities here. First we've got
several new sources of integrity checking. If a block points to an
inode and the inode doesn't point back to it, we follow the inode's
corresponding pointer forward and back. If that turns out to be
consistent, we know that the original block is in the wrong.
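
To make the forward-and-back check concrete, here is a rough sketch in
Python. It is only an illustration under assumptions of mine: the
names (Inode, Rmap, judge_block) and the idea that an rmap entry
records an (inode number, slot) pair are hypothetical, not an on-disk
format anyone has settled on.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Inode:
    number: int
    # Forward pointers: logical slot -> physical block number.
    # (Hypothetical layout, chosen only for this sketch.)
    blocks: Dict[int, int] = field(default_factory=dict)


# Back pointers (per-block rmap): physical block -> (inode number, slot).
# Assumption: the rmap records which slot of the inode owns the block.
Rmap = Dict[int, Tuple[int, int]]


def judge_block(block: int, rmap: Rmap, inodes: Dict[int, Inode]) -> str:
    """Classify one block by comparing its back pointer with the inode."""
    owner = rmap.get(block)
    if owner is None:
        return "unowned"            # no back pointer at all
    ino, slot = owner
    inode = inodes.get(ino)
    if inode is None:
        return "undecided"          # the claimed owner cannot be read
    forward = inode.blocks.get(slot)
    if forward == block:
        return "consistent"         # inode points back at this block
    # Mismatch: follow the inode's own pointer forward, then back again.
    if forward is not None and rmap.get(forward) == (ino, slot):
        # The inode and the block it points to agree with each other,
        # so the block we started from is the one in the wrong.
        return "block-is-wrong"
    return "undecided"              # neither side checks out
```

A block that comes back "undecided" is one where the pointers alone
cannot settle the question, so some other source of evidence is needed.
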
We can also check the tile header and inode CRCs if the block and the
inode disagree and neither pointer checks out. Failing that, stray
blocks can get attached to an inode in lost+found, and we can reattach
them later during a full filesystem sweep.

Another possibility is that we can compare forward and backward
pointers at runtime. This is probably a good idea anyway: we've got to
read tile headers for runtime CRC checks, so we might as well check the
pointers too. If we discover a pointer mismatch, we can either orphan
data blocks or recover orphans on the fly. So doing a filesystem-wide
backup will also do most of an fsck.

-- 
Mathematics is the supreme nostalgia of our time.