On Thu, Dec 22, 2022 at 10:02:44AM +0800, Jun Nie wrote:
> There is case that s_first_data_block is not 0 and block nr is smaller than
> s_first_data_block when calculating group bitmap during allocation. This
> underflow make index exceed es->s_groups_count in ext4_get_group_info()
> and trigger the BUG_ON.
>
> Fix it with protection of underflow.

When was this happening, and why? If blocknr is less than
s_first_data_block, this is either insufficient input validation,
insufficient validation to detect file system corruption, or some
other kernel bug.

Looking quickly at the code and the repro, it appears that the issue is
that FS_IOC_GETFSMAP is getting passed a starting physical block of 0
in fmh_keys[0] when on a file system with a blocksize of 1k (in which
case s_first_data_block is 1).

It's unclear to me what FS_IOC_GETFSMAP should *do* when passed a value
which requests that it provide a mapping for a block which is out of
bounds (either too big, or too small). Should it return an error?
Should it simply not return a mapping? The man page for
ioctl_getfsmap() doesn't shed any light on this question.

Darrick, you designed the interface and wrote most of fs/ext4/fsmap.c.
Can you let us know what is supposed to happen in this case? Many
thanks!!

> Fixes: 72b64b594081ef ("ext4 uninline ext4_get_group_no_and_offset()")

This makes ***no*** sense; the commit in question is from 2006, which
means that in some jurisdictions it's old enough to drive a car. :-)
Furthermore, all it does is move the function from an inline function
to a C file (in this case, balloc.c). It also long predates the
introduction of FS_IOC_GETFSMAP support, which was in 2017.

I'm guessing you just did a "git blame" and blindly assumed that
whatever commit last touched the C code in question was what introduced
the problem?

Anyway, please try to understand what is going on instead of doing the
moral equivalent of taking a sledgehammer to the code until the
reproducer stops triggering a BUG.
It's not enough to shut up the reproducer; you should understand what
is happening, and why, and then strive to find the best fix to the
problem. Papering over problems in the end will result in more fragile
code, and the goal of syzkaller is to improve kernel quality. But
syzkaller is just a tool and, used wrongly, it can have the opposite
effect.

Regards,

					- Ted
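
P.S. For anyone following along, the arithmetic behind the underflow
discussed above can be sketched in userspace C. This is only an
illustration, not the kernel code: struct fs_geom and both helpers are
hypothetical names, and the geometry is assumed. The unchecked helper
does a plain subtract-then-divide on unsigned values, so a block
number of 0 with a first data block of 1 wraps around and yields a
group index far beyond the group count. The checked helper shows one
possible form of range validation; it does not answer the open
question of whether an out-of-bounds request should get an error or
simply no mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the superblock fields involved;
 * this is not the kernel's structure. */
struct fs_geom {
	uint64_t first_data_block;	/* 1 on a 1k-blocksize file system */
	uint64_t blocks_per_group;	/* assumed value, chosen by the caller */
	uint64_t groups_count;
};

/* Unchecked subtract-then-divide: wraps around when
 * blocknr < first_data_block because the math is unsigned. */
static uint64_t group_of_unchecked(const struct fs_geom *g, uint64_t blocknr)
{
	assert(g->blocks_per_group != 0);
	return (blocknr - g->first_data_block) / g->blocks_per_group;
}

/* Validate the range first; returns false for a block number that is
 * too small or too large, and leaves *group untouched in that case. */
static bool group_of_checked(const struct fs_geom *g, uint64_t blocknr,
			     uint64_t *group)
{
	uint64_t grp;

	assert(g->blocks_per_group != 0);
	if (blocknr < g->first_data_block)
		return false;

	grp = (blocknr - g->first_data_block) / g->blocks_per_group;
	if (grp >= g->groups_count)
		return false;

	*group = grp;
	return true;
}
```

With first_data_block = 1, calling group_of_unchecked() on block 0
returns an index well past groups_count, which is the kind of value
that would trip a bounds assertion further down the call chain.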