Re: Enabling h-trees too early?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Thu, Sep 20, 2007 at 06:19:04PM +0200, Jan Kara wrote:
> > if (EXT4_HAS_COMPAT_FEATURE(inode->i_sb, EXT4_FEATURE_COMPAT_DIR_INDEX) &&
> >     ((EXT4_I(inode)->i_flags & EXT4_INDEX_FL) ||
> >      ((inode->i_size >> sb->s_blocksize_bits) == 1))) {
> >         error = ext4_dx_readdir(filp, dirent, filldir);
> >         if (error != ERR_BAD_DX_DIR) {
> >                 ret = error;
> >                 goto out;
> >         }
> >         /*
> >          * We don't set the inode dirty flag since it's not
> >          * critical that it get flushed back to the disk.
> >          */
> >         EXT4_I(filp->f_path.dentry->d_inode)->i_flags &= ~EXT4_INDEX_FL;
> > }
> >   It calls ext4_dx_readdir() for *every* directory with 1 block (we have
> > 1326 of them in the kernel tree). Now ext4_dx_readdir() calls
> > ext4_htree_fill_tree() which finds out the directory is not h-tree and
> > and calls htree_dirblock_to_tree(). So even for 4KB directories we end up
> > deleting inodes in hash order! And as a bonus we burn some cycles building
> > trees etc. What is the point of this?
> 
> That was added so we wouldn't get screwed when a directory that was
> previously non htree became an htree directory while the directory fd
> is open.  So the failure case is one where you do opendir(), readdir()
> on 25% of the directory, sleep for 2 hours, and in the meantime, 200
> files are added to the directory and it gets converted into a htree
> index, causing all of the previously returned readdir() results in
> directory order to be completely screwed up now that the directory has
> been converted into an htree.  (All of the readdir/telldir/seekdir 
> POSIX requirements cause filesystem designers to tear their hair out.)
  Oh, yes. Thanks for explanation.

> What we would need to do to avoid needing this is to read in the
> entire directory leaf page into the rbtree, sorted by inode number,
> and then to keep that rbtree for the entire life of the open directory
> file descriptor.  We would also have to change telldir/seekdir to use
> something else as a telldir cookie, and readdir would have to be set
> up to *only* use the rbtree, and never look at the on-disk directory.
> This would also mean that all of the files created or deleted after
> the initial opendir() would never be reflected in results returned by
> readdir(), but that's allowed by POSIX.  And if we do this for a
> single block 4k directory, we might as well do it for a 32k or 64k
> HTREE directory as well.
  Yes, this makes sence...

								Honza 
-- 
Jan Kara <jack@xxxxxxx>
SuSE CR Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux