Hello, does anybody have any advice?

Thanks,
Yi.

On 2022/11/10 10:25, Zhang Yi wrote:
> Changes since v1:
>  - Fix format error in ext4_fault_ops_write().
>
> Currently we can test ext4's reliability with the failure simulation
> facility introduced in commit 46f870d690fe ("ext4: simulate various I/O
> and checksum errors when reading metadata"), which can simulate a
> checksum error or an I/O error when reading metadata from disk. But its
> functionality is limited: it cannot set the number of failures, the
> probability, filters, etc. Fortunately, Linux already has a common
> fault-injection framework, so these limitations can easily be overcome
> by using it in ext4. This patch set adds an ext4 fault-injection
> facility to replace the old one, supplies several kinds of checksum
> and I/O errors, and also adds group, inode, physical block and inode
> logical block filters. With this patch set, we can inject failures more
> precisely. The facility can be used for fuzz and stress testing with
> random errors, and it can also be used to reproduce issues more
> conveniently.
>
> Patch 1: add debugfs in preparation.
> Patch 2: introduce the fault-injection framework for ext4.
> Patch 3-11: add various kinds of faults and also do some cleanup.
> Patch 12: remove the old simulation facility.
>
> It provides a debugfs interface in ext4/<disk>/fault_inject. Besides the
> common config interfaces, we add 6 more:
>  - available_faults: show the available faults we can inject.
>  - inject_faults: set faults, multiple faults can be set at a time.
>  - inject_inode: set the inode filter, matches all inodes if not set.
>  - inject_group: set the block group filter, similar to inject_inode.
>  - inject_logical_block: set the logical block filter for one inode.
>  - inject_physical_block: set the physical block filter.
>
> Currently we add the 20 faults listed below, including 8 kinds of
> metadata checksum errors, 7 metadata I/O errors and 5 journal errors.
> With this facility in place, more faults can easily be added in the
> future.
>  - group_desc_checksum
>  - inode_bitmap_checksum
>  - block_bitmap_checksum
>  - inode_checksum
>  - extent_block_checksum
>  - dir_block_checksum
>  - dir_index_block_checksum
>  - xattr_block_checksum
>  - inode_bitmap_eio
>  - block_bitmap_eio
>  - inode_eio
>  - extent_block_eio
>  - dir_block_eio
>  - xattr_block_eio
>  - symlink_block_eio
>  - journal_start
>  - journal_start_sb
>  - journal_get_create_access
>  - journal_get_write_access
>  - journal_dirty_metadata
>
> For example: inject an inode metadata checksum error on file 'foo'.
>
>  $ mkfs.ext4 -F /dev/pmem0
>  $ mount /dev/pmem0 /mnt
>  $ mkdir /mnt/dir
>  $ touch /mnt/dir/foo
>  $ ls -i /mnt/dir/foo
>    262146 /mnt/dir/foo
>
>  $ echo 100 > /sys/kernel/debug/ext4/pmem0/fault_inject/probability
>  $ echo 1 > /sys/kernel/debug/ext4/pmem0/fault_inject/times
>  $ echo 262146 > /sys/kernel/debug/ext4/pmem0/fault_inject/inject_inode
>  $ echo inode_checksum > /sys/kernel/debug/ext4/pmem0/fault_inject/inject_faults
>  $ echo 1 > /sys/kernel/debug/ext4/pmem0/fault_inject/enable
>  $ echo 3 > /proc/sys/vm/drop_caches   # drop caches
>  $ stat /mnt/dir/foo
>    stat: cannot statx '/mnt/dir/foo': Bad message
>
> The kernel log prints the injection location.
>
>  [  461.433817] FAULT_INJECTION: forcing a failure.
>  [  461.433817] name fault_inject, interval 1, probability 100, space 0, times 1
>  ...
>  [  461.438609] Call Trace:
>  [  461.438875]  <TASK>
>  [  461.439116]  ? dump_stack_lvl+0x73/0xa3
>  [  461.439534]  ? dump_stack+0x13/0x1f
>  [  461.439909]  ? should_fail.cold+0x4a/0x57
>  [  461.440346]  ? ext4_should_fail.cold+0x11f/0x135
>  [  461.440833]  ? __ext4_iget+0x407/0x1410
>  [  461.441245]  ? ext4_lookup+0x1be/0x350
>  [  461.441650]  ? __lookup_slow+0xb9/0x1f0
>  [  461.442070]  ? lookup_slow+0x46/0x70
>  [  461.442463]  ? walk_component+0x13e/0x230
>  [  461.442890]  ? path_lookupat.isra.0+0x8f/0x200
>  [  461.443369]  ? filename_lookup+0xd6/0x240
>  [  461.443798]  ? vfs_statx+0xa6/0x200
>  [  461.444186]  ? do_statx+0x48/0xc0
>  [  461.444546]  ? __might_sleep+0x56/0xc0
>  [  461.444950]  ? should_fail_usercopy+0x19/0x30
>  [  461.445424]  ? strncpy_from_user+0x33/0x2a0
>  [  461.445870]  ? getname_flags+0x95/0x330
>  [  461.446288]  ? switch_fpu_return+0x27/0x1e0
>  [  461.446736]  ? __x64_sys_statx+0x90/0xd0
>  [  461.447160]  ? do_syscall_64+0x3b/0x90
>  [  461.447563]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
>  [  461.448122]  </TASK>
>  [  461.448395] EXT4-fs error (device pmem0): ext4_lookup:1840: inode #262146: comm stat: iget: checksum invalid
>
> Thanks,
> Yi.
>
>
> Zhang Yi (12):
>   ext4: add debugfs interface
>   ext4: introduce fault injection facility
>   ext4: add several checksum fault injection
>   ext4: add bitmaps I/O fault injection
>   ext4: add inode I/O fault injection
>   ext4: add extent block I/O fault injection
>   ext4: add dirblock I/O fault injection
>   ext4: call ext4_xattr_get_block() when getting xattr block
>   ext4: add xattr block I/O fault injection
>   ext4: add symlink block I/O fault injection
>   ext4: add journal related fault injection
>   ext4: remove simulate fail facility
>
>  fs/ext4/Kconfig     |   9 ++
>  fs/ext4/balloc.c    |  14 ++-
>  fs/ext4/bitmap.c    |   4 +
>  fs/ext4/dir.c       |   3 +
>  fs/ext4/ext4.h      | 181 +++++++++++++++++++++++++++++--------
>  fs/ext4/ext4_jbd2.c |  22 +++--
>  fs/ext4/ext4_jbd2.h |   5 +
>  fs/ext4/extents.c   |   7 ++
>  fs/ext4/ialloc.c    |  24 +++--
>  fs/ext4/inode.c     |  26 ++++--
>  fs/ext4/namei.c     |  14 ++-
>  fs/ext4/super.c     |   7 +-
>  fs/ext4/symlink.c   |   4 +
>  fs/ext4/sysfs.c     | 183 +++++++++++++++++++++++++++++++++++--
>  fs/ext4/xattr.c     | 216 +++++++++++++++++++-------------------------
>  15 files changed, 515 insertions(+), 204 deletions(-)
>
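
For anyone who wants to script the example in the cover letter, the steps could be wrapped roughly as below. This is only a sketch: the file names (probability, times, inject_inode, inject_faults, enable) are the ones described above and exist only with this series applied, while the function name ext4_arm_fault and its arguments are made up for illustration. Taking the fault_inject directory as a parameter lets the sequence be dry-run against a scratch directory.

```shell
#!/bin/sh
# Sketch: arm a single ext4 fault for a single inode, following the example
# in the cover letter. ext4_arm_fault is a hypothetical helper, not part of
# the patch set.
#
#   $1  fault_inject directory (assumed layout from the cover letter,
#       e.g. /sys/kernel/debug/ext4/pmem0/fault_inject)
#   $2  fault name, e.g. inode_checksum
#   $3  inode number to filter on
ext4_arm_fault() {
    fi_dir=$1
    fault=$2
    inode=$3

    [ -d "$fi_dir" ] || { echo "no such directory: $fi_dir" >&2; return 1; }

    echo 100      > "$fi_dir/probability"   || return 1  # always fire
    echo 1        > "$fi_dir/times"         || return 1  # fire only once
    echo "$inode" > "$fi_dir/inject_inode"  || return 1  # inode filter
    echo "$fault" > "$fi_dir/inject_faults" || return 1  # fault to inject
    echo 1        > "$fi_dir/enable"        || return 1  # arm last
}

# Intended usage on a kernel with this series applied (not run here):
#   ext4_arm_fault /sys/kernel/debug/ext4/pmem0/fault_inject \
#       inode_checksum "$(stat -c %i /mnt/dir/foo)"
#   echo 3 > /proc/sys/vm/drop_caches
#   stat /mnt/dir/foo
```

The enable file is written last so that the filters are already in place when injection is switched on, the same order as in the manual example.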