On Jan 23, 2019, at 11:32 AM, harshadshirwadkar@xxxxxxxxx wrote: > > From: Harshad Shirwadkar <harshadshirwadkar@xxxxxxxxx> > > This patch is the first step towards shrinking htree based directories > when files are deleted. We truncate directory inode when a directory > entry removal causes last directory block to be empty. This patch just > removes the last block. We may end up in a situation where many > intermediate dirent blocks in directory inode are empty. Removing > those blocks would be handled later. > > Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@xxxxxxxxx> > --- > fs/ext4/namei.c | 130 ++++++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 109 insertions(+), 21 deletions(-) > > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c > index 2a4c25c4681d..79cb88ba898a 100644 > --- a/fs/ext4/namei.c > +++ b/fs/ext4/namei.c > @@ -273,7 +273,8 @@ static int ext4_htree_next_block(struct inode *dir, __u32 hash, > __u32 *start_hash); > static struct buffer_head * ext4_dx_find_entry(struct inode *dir, > struct ext4_filename *fname, > - struct ext4_dir_entry_2 **res_dir); > + struct ext4_dir_entry_2 **res_dir, > + struct dx_frame *dx_frame); (style) these should align after '(' on the first line > @@ -380,6 +381,7 @@ static void ext4_dirent_csum_set(struct inode *inode, > (void *)t - (void *)dirent); > } > > + > int ext4_handle_dirty_dirent_node(handle_t *handle, > struct inode *inode, > struct buffer_head *bh) (style) No need for this hunk. > @@ -866,6 +868,42 @@ dx_probe(struct ext4_filename *fname, struct inode *dir, > return ret_err; > } > > +static void > +ext4_dx_delete_entry(handle_t *handle, struct inode *dir, > + struct dx_frame *dx_frame, __le64 block) > +{ > + struct dx_entry *entries; > + unsigned int count, limit; > + int err, i; > + > + entries = dx_frame->entries; > + count = dx_get_count(entries); > + limit = dx_get_limit(entries); > + > + for (i = 0; i < count; i++) > + if (entries[i].block == block) > + break; > + > + if (i >= count) > + return; > + > + err = ext4_journal_get_write_access(handle, dx_frame->bh); > + if (err) { > + ext4_std_error(dir->i_sb, err); > + return; > + } > + > + for (; i < count - 1; i++) > + entries[i] = entries[i + 1]; > + > + dx_set_count(entries, count - 1); > + dx_set_limit(entries, limit); I don't think you need to update the limit just because of the count changing? This is just setting it back to the same value that dx_get_limit() returned earlier. > @@ -1309,6 +1347,14 @@ > +static int is_empty_dirent_block(struct inode *dir, struct buffer_head *bh) (style) this function should be type bool. The older "is_*" functions were written before "bool" was available in the kernel, though a separate cleanup patch to convert them to bool would be welcome. > +{ > + struct ext4_dir_entry_2 *de = (struct ext4_dir_entry_2 *)bh->b_data; > + > + return (ext4_rec_len_from_disk(de->rec_len, dir->i_sb->s_blocksize) > + == dir->i_sb->s_blocksize); (style) no need for () around return value. > > @@ -1339,7 +1385,8 @@ static int is_dx_internal_node(struct inode *dir, ext4_lblk_t block, > static struct buffer_head * ext4_find_entry (struct inode *dir, > const struct qstr *d_name, > struct ext4_dir_entry_2 **res_dir, > - int *inlined) > + int *inlined, > + struct dx_frame *dx_frame) (style) this can fit on the previous line, and they can all align after '(' on the first line. Please also remove the space after '*' in the function declaration. > @@ -1486,9 +1533,10 @@ static struct buffer_head * ext4_find_entry (struct inode *dir, > return ret; > } > > -static struct buffer_head * ext4_dx_find_entry(struct inode *dir, > - struct ext4_filename *fname, > - struct ext4_dir_entry_2 **res_dir) > +static struct buffer_head *ext4_dx_find_entry( > + struct inode *dir, struct ext4_filename *fname, (style) it isn't clear why you moved "struct inode *dir" from the previous line? The continued lines look like they can properly align after '(' on the previous line. > @@ -2337,9 +2389,13 @@ int ext4_generic_delete_entry(handle_t *handle, > static int ext4_delete_entry(handle_t *handle, > struct inode *dir, > struct ext4_dir_entry_2 *de_del, > - struct buffer_head *bh) > + struct buffer_head *bh, > + struct dx_frame *dx_frame) > { > + struct ext4_map_blocks map; > int err, csum_size = 0; > + int should_truncate = 0; (style) bool should_truncate > @@ -2363,11 +2419,35 @@ static int ext4_delete_entry(handle_t *handle, > if (err) > goto out; > > + if (dx_frame && dx_frame->bh && is_empty_dirent_block(dir, bh)) { > + (style) no blank line here. > + map.m_lblk = (dir->i_size - 1) >> > + (dir->i_sb->s_blocksize_bits); (style) no need for parenthesis around dir->i_sb->s_blocksize_bits. (style) this can fit on the previous line and still be under 80 columns. > + map.m_len = 1; > + err = ext4_map_blocks(handle, dir, &map, 0); > + > + if ((err > 0) && (bh->b_blocknr == map.m_pblk)) { (style) no need for extra parenthesis around "err > 0" and "b_blocknr == m_pblk". > + ext4_dx_delete_entry(handle, dir, dx_frame, > + cpu_to_le64(map.m_lblk)); > + should_truncate = 1; > + } > + } > + > BUFFER_TRACE(bh, "call ext4_handle_dirty_metadata"); > + > err = ext4_handle_dirty_dirent_node(handle, dir, bh); > if (unlikely(err)) > goto out; > > + if (should_truncate) { > + dir->i_size -= dir->i_sb->s_blocksize; > + ext4_truncate(dir); Since we have already mapped the last block above, I wonder if we can do something lighter weight than calling ext4_truncate() here? Maybe moving the ext4_truncate() guts into a helper function like ext4_truncate_internal() that avoids all of the complexity (orphan list, resizing the transaction, inline data, etc). > + if (dir->i_size == dir->i_sb->s_blocksize) { > + ext4_clear_inode_flag(dir, EXT4_INODE_INDEX); > + ext4_mark_inode_dirty(handle, dir); Are there any places in the code that assume an indexed directory will remain as such? The tricky part here is that any valid htree directory will have 3 blocks (dx_root, plus two leaf blocks), so it might be problematic to go back to a single non-htree block, especially since this doesn't look like it has any locking. It is probably worthwhile to check the code for this. > @@ -2985,6 +3065,9 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry) > struct buffer_head *bh; > struct ext4_dir_entry_2 *de; > handle_t *handle = NULL; > + struct dx_frame dx_frame; > + > + memset(&dx_frame, 0, sizeof(dx_frame)); Rather than an explicit memset(), this can use a struct initializer when it is declared: struct dx_frame dx_frame = { NULL }; Are there existing xfstests that cover this case? Certainly deleting files from a directory, but it would be useful to have something that checks whether the directory size is shrinking and this code is working. Cheers, Andreas
Attachment:
signature.asc
Description: Message signed with OpenPGP