> On Nov 10, 2024, at 9:36 PM, Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote: > > Hi, > > 在 2024/11/11 8:52, cel@xxxxxxxxxx 写道: >> From: yangerkun <yangerkun@xxxxxxxxxx> >> [ Upstream commit 64a7ce76fb901bf9f9c36cf5d681328fc0fd4b5a ] >> After we switch tmpfs dir operations from simple_dir_operations to >> simple_offset_dir_operations, every rename happened will fill new dentry >> to dest dir's maple tree(&SHMEM_I(inode)->dir_offsets->mt) with a free >> key starting with octx->newx_offset, and then set newx_offset equals to >> free key + 1. This will lead to infinite readdir combine with rename >> happened at the same time, which fail generic/736 in xfstests(detail show >> as below). >> 1. create 5000 files(1 2 3...) under one dir >> 2. call readdir(man 3 readdir) once, and get one entry >> 3. rename(entry, "TEMPFILE"), then rename("TEMPFILE", entry) >> 4. loop 2~3, until readdir return nothing or we loop too many >> times(tmpfs break test with the second condition) >> We choose the same logic what commit 9b378f6ad48cf ("btrfs: fix infinite >> directory reads") to fix it, record the last_index when we open dir, and >> do not emit the entry which index >= last_index. The file->private_data > > Please notice this requires last_index should never overflow, otherwise > readdir will be messed up. It would help your cause if you could be more specific than "messed up". >> now used in offset dir can use directly to do this, and we also update >> the last_index when we llseek the dir file. >> Fixes: a2e459555c5f ("shmem: stable directory offsets") >> Signed-off-by: yangerkun <yangerkun@xxxxxxxxxx> >> Link: https://lore.kernel.org/r/20240731043835.1828697-1-yangerkun@xxxxxxxxxx >> Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx> >> [brauner: only update last_index after seek when offset is zero like Jan suggested] >> Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx> >> Link: https://nvd.nist.gov/vuln/detail/CVE-2024-46701 >> [ cel: adjusted to apply to origin/linux-6.6.y ] >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> >> --- >> fs/libfs.c | 37 +++++++++++++++++++++++++------------ >> 1 file changed, 25 insertions(+), 12 deletions(-) >> diff --git a/fs/libfs.c b/fs/libfs.c >> index a87005c89534..b59ff0dfea1f 100644 >> --- a/fs/libfs.c >> +++ b/fs/libfs.c >> @@ -449,6 +449,14 @@ void simple_offset_destroy(struct offset_ctx *octx) >> xa_destroy(&octx->xa); >> } >> +static int offset_dir_open(struct inode *inode, struct file *file) >> +{ >> + struct offset_ctx *ctx = inode->i_op->get_offset_ctx(inode); >> + >> + file->private_data = (void *)ctx->next_offset; >> + return 0; >> +} > > Looks like xarray is still used. That's not going to change, as several folks have already explained. > I'm in the cc list ,so I assume you saw my set, then I don't know why > you're ignoring my concerns. > 1) next_offset is 32-bit and can overflow in a long-time running > machine. > 2) Once next_offset overflows, readdir will skip the files that offset > is bigger. In that case, that entry won't be visible via getdents(3) until the directory is re-opened or the process does an lseek(fd, 0, SEEK_SET). That is the proper and expected behavior. I suspect you will see exactly that behavior with ext4 and 32-bit directory offsets, for example. Does that not directly address your concern? Or do you mean that Erkun's patch introduces a new issue? If there is a problem here, please construct a reproducer against this patch set and post it. > Thanks, > Kuai > >> + >> /** >> * offset_dir_llseek - Advance the read position of a directory descriptor >> * @file: an open directory whose position is to be updated >> @@ -462,6 +470,9 @@ void simple_offset_destroy(struct offset_ctx *octx) >> */ >> static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence) >> { >> + struct inode *inode = file->f_inode; >> + struct offset_ctx *ctx = inode->i_op->get_offset_ctx(inode); >> + >> switch (whence) { >> case SEEK_CUR: >> offset += file->f_pos; >> @@ -475,8 +486,9 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence) >> } >> /* In this case, ->private_data is protected by f_pos_lock */ >> - file->private_data = NULL; >> - return vfs_setpos(file, offset, U32_MAX); >> + if (!offset) >> + file->private_data = (void *)ctx->next_offset; >> + return vfs_setpos(file, offset, LONG_MAX); >> } >> static struct dentry *offset_find_next(struct xa_state *xas) >> @@ -505,7 +517,7 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry) >> inode->i_ino, fs_umode_to_dtype(inode->i_mode)); >> } >> -static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx) >> +static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx, long last_index) >> { >> struct offset_ctx *so_ctx = inode->i_op->get_offset_ctx(inode); >> XA_STATE(xas, &so_ctx->xa, ctx->pos); >> @@ -514,17 +526,21 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx) >> while (true) { >> dentry = offset_find_next(&xas); >> if (!dentry) >> - return ERR_PTR(-ENOENT); >> + return; >> + >> + if (dentry2offset(dentry) >= last_index) { >> + dput(dentry); >> + return; >> + } >> if (!offset_dir_emit(ctx, dentry)) { >> dput(dentry); >> - break; >> + return; >> } >> dput(dentry); >> ctx->pos = xas.xa_index + 1; >> } >> - return NULL; >> } >> /** >> @@ -551,22 +567,19 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx) >> static int offset_readdir(struct file *file, struct dir_context *ctx) >> { >> struct dentry *dir = file->f_path.dentry; >> + long last_index = (long)file->private_data; >> lockdep_assert_held(&d_inode(dir)->i_rwsem); >> if (!dir_emit_dots(file, ctx)) >> return 0; >> - /* In this case, ->private_data is protected by f_pos_lock */ >> - if (ctx->pos == DIR_OFFSET_MIN) >> - file->private_data = NULL; >> - else if (file->private_data == ERR_PTR(-ENOENT)) >> - return 0; >> - file->private_data = offset_iterate_dir(d_inode(dir), ctx); >> + offset_iterate_dir(d_inode(dir), ctx, last_index); >> return 0; >> } >> const struct file_operations simple_offset_dir_operations = { >> + .open = offset_dir_open, >> .llseek = offset_dir_llseek, >> .iterate_shared = offset_readdir, >> .read = generic_read_dir, -- Chuck Lever