On May 1, 2017, at 12:58 PM, Eric Biggers <ebiggers3@xxxxxxxxx> wrote: > > On Sat, Apr 29, 2017 at 08:59:53PM -0400, Theodore Ts'o wrote: >> On Thu, Mar 16, 2017 at 05:51:17AM -0400, Artem Blagodarenko wrote: >>> From: Artem Blagodarenko <artem.blagodarenko@xxxxxxxxx> >>> >>> This INCOMPAT_LARGEDIR feature allows larger directories to be created >>> in ldiskfs, both with directory sizes over 2GB and and a maximum htree >>> depth of 3 instead of the current limit of 2. These features are needed >>> in order to exceed the current limit of approximately 10M entries in a >>> single directory. >>> >>> Signed-off-by: Yang Sheng <yang.sheng@xxxxxxxxx> >>> Signed-off-by: Artem Blagodarenko <artem.blagodarenko@xxxxxxxxxxx> >> >> Thanks, applied. >> >> - Ted > > FYI, this patch is causing a problem when creating many files in a directory > (without the largedir feature enabled). I haven't looked into it but maybe it > will ring a bell for someone. Hmm, is this also a problem without the patch, when creating large numbers of entries in a directory? It looks like the directory index is clobbering the directory index block checksum, which is stored in struct ext2_dx_tail at the end of each index block. One possibility is that the patch is miscalculating the maximum number of entries in the index blocks when the metadata_csum feature is enabled? I don't think that feature is enabled by default with e2fsprogs yet, so it is entirely possible that these two features aren't quite playing nice together in the sandbox. That said, I looked through the patch again with this in mind, and I don't see anything related to the dx_limit. The dx_node_limit() function correctly takes the metadata_csum feature into account when calculating the limit of the htree entries in interior index blocks, so it isn't clear where this checksum error is coming from. It might be useful to print out the checksum values, just in case e.g. this is being checked on a block that was just zeroed out? Alternately, it might be that we are not initializing the checksum properly in the first place? Looking at it from this angle, I see what could be a problem. Can you try out the following (untested) patch (also attached in case of mangling)? I'm not totally sure it is correct, since the change from ext4_handle_dirty_metadata() to ext4_handle_dirty_dx_block() could potentially be ext4_handle_dirty_dirent_block(), but I _think_ the current change is right. There is also a separate question of whether we need to dirty ALL of the buffers in this code, compared to the pre-patch changes which had fewer calls to dirty buffers, but at least this patch should fix the checksum errors. Cheers, Andreas ---- diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index bd189c04..f0e8a6c 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -2309,21 +2309,23 @@ static int ext4_dx_add_entry(handle_t *handle, dx_insert_block((frame - 1), hash2, newblock); dxtrace(dx_show_index("node", frame->entries)); dxtrace(dx_show_index("node", - ((struct dx_node *) bh2->b_data)->entries)); + ((struct dx_node *)bh2->b_data)->entries)); err = ext4_handle_dirty_dx_node(handle, dir, bh2); if (err) goto journal_error; - brelse (bh2); - ext4_handle_dirty_metadata(handle, dir, - (frame - 1)->bh); + brelse(bh2); + err = ext4_handle_dirty_dx_node(handle, dir, + (frame - 1)->bh); + if (err) + goto journal_error; if (restart) { - ext4_handle_dirty_metadata(handle, dir, - frame->bh); - goto cleanup; + err = ext4_handle_dirty_dirty_dx_node(handle, + dir, frame->bh); + goto journal_error; } - } else { + } else /* add_level */ { struct dx_root *dxroot; - memcpy((char *) entries2, (char *) entries, + memcpy((char *)entries2, (char *)entries, icount * sizeof(struct dx_entry)); dx_set_limit(entries2, dx_node_limit(dir)); @@ -2335,15 +2337,13 @@ static int ext4_dx_add_entry(handle_t *handle, dxtrace(printk(KERN_DEBUG "Creating %d level index...\n", info->indirect_levels)); - ext4_handle_dirty_metadata(handle, dir, frame->bh); - ext4_handle_dirty_metadata(handle, dir, bh2); + err = ext4_handle_dirty_dx_node(handle, dir, frame->bh); + if (err) + goto journal_error; + err = ext4_handle_dirty_dx_node(handle, dir, bh2); brelse(bh2); restart = 1; - goto cleanup; - } - if (err) { - ext4_std_error(inode->i_sb, err); - goto cleanup; + goto journal_error; } } de = do_split(handle, dir, &bh, frame, &fname->hinfo); @@ -2355,7 +2355,7 @@ static int ext4_dx_add_entry(handle_t *handle, goto cleanup; journal_error: - ext4_std_error(dir->i_sb, err); + ext4_std_error(dir->i_sb, err); /* this is a no-op if err == 0 */ cleanup: brelse(bh); dx_release(frames); > seq -f "abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch > > [ 42.726480] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.729472] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.731689] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.734303] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.736383] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.739133] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.741307] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.743963] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > [ 42.745998] EXT4-fs error (device vdc): dx_probe:840: inode #2: block 508: comm touch: Directory index failed checksum > ... Cheers, Andreas
Attachment:
0001-ext4-fix-large_dir-interaction-with-metadata_csum.patch
Description: Binary data
Attachment:
signature.asc
Description: Message signed with OpenPGP