On Mon, Sep 12, 2022 at 11:31:54AM +1000, Dave Chinner wrote: > On Wed, Aug 31, 2022 at 08:16:39PM +0800, Guo Xuenan wrote: > > For leaf dir, In most cases, there should be as many bestfree slots > > as the dir data blocks that can fit under i_size (except for [1]). > > > > Root cause is we don't examin the number bestfree slots, when the slots > > number less than dir data blocks, if we need to allocate new dir data > > block and update the bestfree array, we will use the dir block number as > > index to assign bestfree array, while we did not check the leaf buf > > boundary which may cause UAF or other memory access problem. This issue > > can also triggered with test cases xfs/473 from fstests. > > > > Considering the special case [1] , only add check bestfree array boundary, > > to avoid UAF or slab-out-of bound. > > > > [1] https://lore.kernel.org/all/163961697197.3129691.1911552605195534271.stgit@magnolia/ > > > > Simplify the testcase xfs/473 with commands below: > > DEV=/dev/sdb > > MP=/mnt/sdb > > WORKDIR=/mnt/sdb/341 #1. mkfs create new xfs image > > mkfs.xfs -f ${DEV} > > mount ${DEV} ${MP} > > mkdir -p ${WORKDIR} > > for i in `seq 1 341` #2. create leaf dir with 341 entries file name len 8 > > do > > touch ${WORKDIR}/$(printf "%08d" ${i}) > > done > > inode=$(ls -i ${MP} | cut -d' ' -f1) > > umount ${MP} #3. xfs_db set bestcount to 0 > > xfs_db -x ${DEV} -c "inode ${inode}" -c "dblock 8388608" \ > > -c "write ltail.bestcount 0" > > mount ${DEV} ${MP} > > touch ${WORKDIR}/{1..100}.txt #4. touch new file, reproduce > ..... > > diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c > > index d9b66306a9a7..4b2a72b3a6f3 100644 > > --- a/fs/xfs/libxfs/xfs_dir2_leaf.c > > +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c > > @@ -819,6 +819,18 @@ xfs_dir2_leaf_addname( > > */ > > else > > xfs_dir3_leaf_log_bests(args, lbp, use_block, use_block); > > + /* > > + * An abnormal corner case, bestfree count less than data > > + * blocks, add a condition to avoid UAF or slab-out-of bound. > > + */ > > + if ((char *)(&bestsp[use_block]) > (char *)ltp) { Aha, so this /can/ be detected by walking off the end of the buffer... > > + xfs_trans_brelse(tp, lbp); > > + if (tp->t_flags & XFS_TRANS_DIRTY) > > + xfs_force_shutdown(tp->t_mountp, > > + SHUTDOWN_CORRUPT_ONDISK); > > + return -EFSCORRUPTED; > > + } > > + > > As I explained here: > > https://lore.kernel.org/linux-xfs/20220829081244.GT3600936@xxxxxxxxxxxxxxxxxxx/ > > We don't check for overruns caused by on-disk format corruptions in > every operation - we catch the corruption when the metadata is first > read into memory via the verifier. > > Please add a check for a corrupt/mismatched best sizes array to > xfs_dir3_leaf_check_int() so that the corruption is detected on > first read and the internal directory code is never exposed to such > issues. ...in which case this should go in the buffer verifier. Seconded. --D > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx