On Mar 13, 2008 13:20 -0400, Theodore Ts'o wrote: > I was just auditing the e2fsprogs nlinks-dir patch, and I think there > are some problems with it. First of all, it creates a whole new set of > icount abstractions so it can count using a 32-bit count. That's > strictly not necessary, since all e2fsck cares about is whether the > count is bigger than EXT2_LINK_MAX, or 65,000. One of the reasons for this change is the ability to actually count nlinks > 65536 - the old code would just blindly wrap if this ever happened. It might also be useful in the future if we rejig the 256-byte inode to have a 32-bit i_nlink count. There was some other bit of code we were working on that needed a 32-bit icount, but I can't recall what it is right now. Maybe Girish or Kalpak will recall? > The bigger problem is that it fetches the actual number of > subdirectories (which could be some very large number, like 65538, or > some number much larger than 2**16): > > ext2fs_icount_fetch32(ctx->inode_count, i, &link_counted); > > It then tests to see if the on-disk count is OK, which it does this way: > > if (link_counted != link_count && > !(ext2fs_test_inode_bitmap(ctx->inode_dir_map, i) && > link_count == 1 && link_counted > EXT2_LINK_MAX)) { > > So if it is the actual and on-disk count is different, but the inode in > question is an inode and the on-disk count is 1, but the actual count is ^^^^^ directory? > greater than EXT2_LINK_MAX, it's OK. Otherwise, we need to Take Action. > > OK, but what if the on-disk count is some number less than EXT2_LINK_MAX > --- say, 64,999? And what if the actual number of subdiretories is > somewhat larger and greater than EXT2_LINK_MAX, say, like 65538? Well, > then the code blindly assignens the 32-bit link_counted to the 16-bit > on-disk i_links_count field, and writes it to disk: > > inode->i_links_count = link_counted; > e2fsck_write_inode(ctx, i, inode, "pass4"); > > The number 65538 will get masked down to 2. Hilarity then ensues. Can you expand? I tested the kernel and a link count of 2 will still result in the "ext3_is_empty()" function being called prior to doing the unlink and it will be refused. If a subdirectory is removed at that point nlink will go down to 1 again and all is well? > + * We now use a 32-bit counter field because it doesn't cost us > + * anything extra for the in-memory data structure, due to alignment > + * padding. But there's no point changing the interface if most of > + * the time we only care if the number is bigger than 65,000 or not. > + * So use the following translation function to return a 16-bit count. > + */ > +#define icount_16_xlate(x) (((x) > 65500) ? 65500 : (x)) This hack does make the patch a lot less intrusive... I don't have any objections. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html