Hi Ted,

On Sun, May 12, 2019 at 12:07 AM Theodore Ts'o <tytso@xxxxxxx> wrote:
> On Sat, May 11, 2019 at 02:43:16PM +0200, Richard Weinberger wrote:
> > [CC'in linux-ext4]
> >
> > On Sat, May 11, 2019 at 1:47 PM Arthur Marsh
> > <arthur.marsh@xxxxxxxxxxxxxxxx> wrote:
> > >
> > >
> > > The filesystem with the kernel source tree is the root file system, ext3, mounted as:
> > >
> > > /dev/sdb7 on / type ext3 (rw,relatime,errors=remount-ro)
> > >
> > > After the "Compressing objects" stage, the following appears in dmesg:
> > >
> > > [ 848.968550] EXT4-fs error (device sdb7): ext4_get_branch:171: inode #8: block 30343695: comm jbd2/sdb7-8: invalid block
> > > [ 849.077426] Aborting journal on device sdb7-8.
> > > [ 849.100963] EXT4-fs (sdb7): Remounting filesystem read-only
> > > [ 849.100976] jbd2_journal_bmap: journal block not found at offset 989 on sdb7-8
>
> This indicates that the extent tree blocks for the journal were found
> to be corrupt; so the journal couldn't be found.
>
> > > # fsck -yv
> > > fsck from util-linux 2.33.1
> > > e2fsck 1.45.0 (6-Mar-2019)
> > > /dev/sdb7: recovering journal
> > > /dev/sdb7 contains a file system with errors, check forced.
>
> But e2fsck had no problem finding the journal.
>
> > > Pass 1: Checking inodes, blocks, and sizes
> > > Pass 2: Checking directory structure
> > > Pass 3: Checking directory connectivity
> > > Pass 4: Checking reference counts
> > > Pass 5: Checking group summary information
> > > Free blocks count wrong (4619656, counted=4619444).
> > > Fix? yes
> > >
> > > Free inodes count wrong (15884075, counted=15884058).
> > > Fix? yes
>
> And no other significant problems were found. (Ext4 never updates or
> relies on the summary number of free blocks and free inodes, since
> updating it is a scalability bottleneck and these values can be
> calculated from the per block group free block/inodes count. So the
> fact that e2fsck needed to update them is not an issue.)
>
> So that implies that we got one set of values when we read the journal
> inode when attempting to mount the file system, and a *different* set
> of values when e2fsck was run. Which means that we need to
> consider the possibility that the problem is below the file system
> layer (e.g., the block layer, device drivers, etc.).
>
> > > /dev/sdb7: ***** FILE SYSTEM WAS MODIFIED *****
> > >
> > > Other times, I have gotten:
> > >
> > > "Inodes that were part of a corrupted orphan linked list found."
> > > "Block bitmap differences:"
> > > "Free blocks count wrong for group"
> > >
>
> This variety of issues also implies that the issue may be in the data
> read by the file system, as opposed to an issue in the file system.
>
> Arthur, can you give us the full details of your hardware
> configuration and your kernel config file? Also, what kernel git
> commit ID were you testing?

I'm seeing similar things running post v5.1 on ARAnyM (Atari emulator):

EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
...
EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: block 27550: comm jbd2/sda1-1980: invalid block

and userspace hung somewhere during initial system startup, so I had
to kill the instance.

-----
EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
EXT4-fs (sda1): INFO: recovery required on readonly filesystem
EXT4-fs (sda1): write access will be enabled during recovery
EXT4-fs warning (device sda1): ext4_clear_journal_err:5078: Filesystem error recorded from previous mount: IO failure
EXT4-fs warning (device sda1): ext4_clear_journal_err:5079: Marking fs in need of filesystem check.
EXT4-fs (sda1): recovery complete
EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
...
Run /sbin/init as init process
random: fast init done
EXT4-fs (sda1): re-mounted. Opts:
random: crng init done
EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
EXT4-fs (sda1): error count since last fsck: 1
EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
EXT4-fs (sda1): last error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
-----
EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
VFS: Mounted root (ext3 filesystem) readonly on device 8:1.
...
Run /sbin/init as init process
random: fast init done
EXT4-fs (sda1): re-mounted. Opts:
EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
random: crng init done
EXT4-fs error (device sda1): ext4_get_branch:171: inode #1980: block 27550: comm jbd2/sda1-1980: invalid block
Aborting journal on device sda1-1980.
EXT4-fs (sda1): Remounting filesystem read-only
jbd2_journal_bmap: journal block not found at offset 426 on sda1-1980
EXT4-fs error (device sda1): ext4_journal_check_start:61: Detected aborted journal
EXT4-fs (sda1): error count since last fsck: 3
EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
EXT4-fs (sda1): last error at time 1558083596: ext4_journal_check_start:61: inode 1980: block 27550
EXT4-fs error (device sda1): ext4_remount:5328: Abort forced by user
---
EXT4-fs (sda1): mounting ext3 file system using the ext4 subsystem
EXT4-fs (sda1): INFO: recovery required on readonly filesystem
EXT4-fs (sda1): write access will be enabled during recovery
random: fast init done
EXT4-fs warning (device sda1): ext4_clear_journal_err:5078: Filesystem error recorded from previous mount: IO failure
EXT4-fs warning (device sda1): ext4_clear_journal_err:5079: Marking fs in need of filesystem check.
EXT4-fs (sda1): recovery complete
EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
...
Run /sbin/init as init process
random: crng init done
EXT4-fs (sda1): re-mounted. Opts:
EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
EXT4-fs (sda1): error count since last fsck: 4
EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
EXT4-fs (sda1): last error at time 1558083665: ext4_remount:5328: inode 1980: block 27550

Notes:
  - It's always the same block.
  - The block device is an image file, accessed using
    arch/m68k/emu/nfblock.c, which did not receive any recent (bvec)
    updates.
  - There are no reported errors on the host for the device containing
    the image file.
  - Given that Arthur sees the issue on a different class of machines,
    it's unlikely to be related to a problem with the block device
    (driver). It may still be an issue with the block layer, though.
  - Both Arthur and I are mounting an ext3 file system using the ext4
    subsystem.

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
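
P.S. The "initial error" / "last error" records in the logs above can be cross-checked mechanically. What follows is only a minimal sketch, assuming the message layout stays exactly as shown in this mail; the names LOG, PATTERN and parse_errors are made up for illustration and are not part of the kernel or e2fsprogs. It decodes the epoch timestamps to UTC and checks that every record names the same inode and block.

```python
import re
from datetime import datetime, timezone

# Three records copied verbatim from the boot logs in this mail.
LOG = """\
EXT4-fs (sda1): initial error at time 1557931133: ext4_get_branch:171: inode 1980: block 27550
EXT4-fs (sda1): last error at time 1558083596: ext4_journal_check_start:61: inode 1980: block 27550
EXT4-fs (sda1): last error at time 1558083665: ext4_remount:5328: inode 1980: block 27550
"""

# Assumed layout: "<kind> error at time <epoch>: <func>:<line>: inode <n>: block <n>"
PATTERN = re.compile(
    r"(initial|last) error at time (\d+): (\w+):(\d+): inode (\d+): block (\d+)"
)


def parse_errors(text):
    """Return one dict per initial/last error record found in text."""
    records = []
    for kind, stamp, func, line, inode, block in PATTERN.findall(text):
        records.append({
            "kind": kind,
            # The kernel reports seconds since the Unix epoch.
            "when": datetime.fromtimestamp(int(stamp), tz=timezone.utc),
            "where": f"{func}:{line}",
            "inode": int(inode),
            "block": int(block),
        })
    return records


records = parse_errors(LOG)
for r in records:
    print(r["when"].isoformat(), r["kind"], r["where"], r["inode"], r["block"])

# True when every record points at the same (inode, block) pair.
same_block = len({(r["inode"], r["block"]) for r in records}) == 1
print("same inode/block in every record:", same_block)
```

For the three records above, the initial error decodes to 2019-05-15 14:38:53 UTC and the two later ones to 2019-05-17, and all of them name inode 1980, block 27550, which matches the first note.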