On Wed, Sep 04, 2024 at 08:30:43PM +0100, Russell King (Oracle) wrote:
> On Wed, Sep 04, 2024 at 07:47:33PM +0100, Russell King (Oracle) wrote:
> > With a 6.10 based kernel, no changes to filesystem/MM code, I'm
> > seeing a reliable BUG_ON() within minutes of booting on one of my
> > VMs. I don't have a complete oops dump, but this is what I do
> > have, cobbled together from what was logged by journald, and
> > what syslogd was able to splat on the terminals before the VM
> > died.
> >
> > Sep 04 15:51:46 lists kernel: kernel BUG at fs/ext4/inode.c:1967!
> >
> > [ 1346.494848] Call trace:
> > [ 1346.495409] [<c04b4f90>] (mpage_process_page_bufs) from [<c04b938c>] (mpage_prepare_extent_to_map+0x410/0x51c)
> > [ 1346.499202] [<c04b938c>] (mpage_prepare_extent_to_map) from [<c04bbc40>] (ext4_do_writepages+0x320/0xb94)
> > [ 1346.502113] [<c04bbc40>] (ext4_do_writepages) from [<c04bc5dc>] (ext4_writepages+0xc0/0x1b4)
> > [ 1346.504662] [<c04bc5dc>] (ext4_writepages) from [<c0361154>] (do_writepages+0x68/0x220)
> > [ 1346.506974] [<c0361154>] (do_writepages) from [<c0354868>] (filemap_fdatawrite_wbc+0x64/0x84)
> > [ 1346.509165] [<c0354868>] (filemap_fdatawrite_wbc) from [<c035706c>] (__filemap_fdatawrite_range+0x50/0x58)
> > [ 1346.511414] [<c035706c>] (__filemap_fdatawrite_range) from [<c035709c>] (filemap_flush+0x28/0x30)
> > [ 1346.513518] [<c035709c>] (filemap_flush) from [<c04a8834>] (ext4_release_file+0x70/0xac)
> > [ 1346.515312] [<c04a8834>] (ext4_release_file) from [<c03f8088>] (__fput+0xd4/0x2cc)
> > [ 1346.517219] [<c03f8088>] (__fput) from [<c03f3e64>] (sys_close+0x28/0x5c)
> > [ 1346.518720] [<c03f3e64>] (sys_close) from [<c0200060>] (ret_fast_syscall+0x0/0x5c)
> >
> > From a quick look, I don't see any patches that touch fs/ext4/inode.c
> > that might address this.
> >
> > I'm not able to do any debugging, and from Friday, I suspect I won't
> > even be able to use a computer (due to operations on my eyes.)
> After rebooting the VM, the next oops was:
>
> Sep 04 19:33:41 lists kernel: Unable to handle kernel paging request at virtual address 5ed304f3 when read
> Sep 04 19:33:42 lists kernel: [5ed304f3] *pgd=80000040005003, *pmd=00000000
> Sep 04 19:33:42 lists kernel: Internal error: Oops: 206 [#1] PREEMPT SMP ARM
>
> kernel:[ 205.583038] Internal error: Oops: 206 [#1] PREEMPT SMP ARM
> kernel:[ 205.630530] Process kworker/u4:2 (pid: 33, stack limit = 0xc68f8000)
> ...
> kernel:[ 205.661017] Call trace:
> kernel:[ 205.661997] [<c04d9060>] (ext4_finish_bio) from [<c04d931c>] (ext4_release_io_end+0x48/0xfc)
> kernel:[ 205.664523] [<c04d931c>] (ext4_release_io_end) from [<c04d94d8>] (ext4_end_io_rsv_work+0x88/0x188)
> kernel:[ 205.666628] [<c04d94d8>] (ext4_end_io_rsv_work) from [<c023f310>] (process_one_work+0x178/0x30c)
> kernel:[ 205.669924] [<c023f310>] (process_one_work) from [<c023fe48>] (worker_thread+0x25c/0x438)
> kernel:[ 205.671679] [<c023fe48>] (worker_thread) from [<c02480b0>] (kthread+0xfc/0x12c)
> kernel:[ 205.673607] [<c02480b0>] (kthread) from [<c020015c>] (ret_from_fork+0x14/0x38)
> kernel:[ 205.682719] Code: e1540005 0a00000d e5941008 e594201c (e5913000)
>
> This corresponds with:
>
> c04d9050:       e1540005        cmp     r4, r5
> c04d9054:       0a00000d        beq     c04d9090 <ext4_finish_bio+0x208>
> c04d9058:       e5941008        ldr     r1, [r4, #8]
> c04d905c:       e594201c        ldr     r2, [r4, #28]
> c04d9060:       e5913000        ldr     r3, [r1]        ;<<<==== faulting instruction
>
> This code is:
>
>                 /*
>                  * We check all buffers in the folio under b_uptodate_lock
>                  * to avoid races with other end io clearing async_write flags
>                  */
>                 spin_lock_irqsave(&head->b_uptodate_lock, flags);
>                 do {
>                         if (bh_offset(bh) < bio_start ||
>                             bh_offset(bh) + bh->b_size > bio_end) {
>
> where r4 is "bh", r4+8 is the location of the bh->b_page pointer.
>
> static inline unsigned long bh_offset(const struct buffer_head *bh)
> {
>         return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
> }
>
> static inline unsigned long compound_nr(struct page *page)
> {
>         struct folio *folio = (struct folio *)page;
>
>         if (!test_bit(PG_head, &folio->flags))
>                 return 1;
>
> where PG_head is bit 6. Thus, bh->b_page was corrupt.
>
> I've booted back into 6.7 on the offending VM, and it's stable, so it
> appears to be a regression between 6.7 and 6.10.

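To spell out why it's that load which faults: bh_offset() has to call
page_size(bh->b_page) before it can mask b_data, and - as the
compound_nr() snippet above shows - the first thing done with b_page is
to read the flags word at offset 0 of the struct page it points at.  So
a corrupt b_page blows up on that very first dereference; b_data and
b_size are never reached.  Here's a rough userspace model of that chain,
using stand-in "model_*" types and helpers rather than the real kernel
structures (the b_page-at-offset-8 layout is only meant to mirror the
32-bit disassembly above):

#include <stdio.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096UL
#define PG_head         6

struct model_page {
        unsigned long flags;            /* offset 0: the word "ldr r3, [r1]" reads */
};

struct model_bh {
        unsigned long b_state;
        struct model_bh *b_this_page;
        struct model_page *b_page;      /* at +8 on 32-bit, hence "ldr r1, [r4, #8]" */
        char *b_data;
        unsigned long b_size;
};

static unsigned long model_page_size(struct model_page *page)
{
        /* the first dereference of b_page: read page->flags to test PG_head */
        int head = !!(page->flags & (1UL << PG_head));

        /* compound handling omitted; a head page would be larger for real */
        return head ? MODEL_PAGE_SIZE * 2 : MODEL_PAGE_SIZE;
}

static unsigned long model_bh_offset(const struct model_bh *bh)
{
        return (unsigned long)bh->b_data & (model_page_size(bh->b_page) - 1);
}

int main(void)
{
        struct model_page page = { .flags = 0 };
        struct model_bh bh = {
                .b_page = &page,
                .b_data = (char *)(uintptr_t)0x1200,
                .b_size = 1024,
        };

        /*
         * With a sane b_page this just prints 0x200.  If b_page held a
         * junk value like 0x5ed304f3 instead, the page->flags read in
         * model_page_size() is the access that would blow up - b_data
         * and b_size never get looked at first.
         */
        printf("bh_offset = %#lx\n", model_bh_offset(&bh));
        return 0;
}

Nothing clever - it just makes it explicit that the faulting address
comes straight from whatever junk ended up in b_page, which lines up
with the 5ed304f3 in the oops above.
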
So.. I decided to stop the VM, and run e2fsck, first not allowing it to
fix stuff. It identified only one error:

e2fsck 1.46.2 (28-Feb-2021)
Pass 1: Checking inodes, blocks, and sizes
Inode 657, i_blocks is 8, should be 0. Fix<y>? no
Inode 657 passes checks, but checksum does not match inode. Fix<y>? no
Pass 2: Checking directory structure
ext2fs_read_inode: Inode checksum does not match inode while reading inode 657 in process_bad_inode
/dev/loop0p1: ********** WARNING: Filesystem still has errors **********
e2fsck: aborted

/dev/loop0p1: ********** WARNING: Filesystem still has errors **********

Booting the VM again and tracking that down (debugfs refused to do the
inode number->name translation for some absurd reason... surely the
inode checksum doesn't _need_ to be correct to find which directory
entry has the inode number?)

Interestingly, when I hit the bad inode:

vdir: cannot access '/var/lib/dpkg/info/libacl1:armhf.triggers': Bad message
vdir: '/var/lib/dpkg/info/libgdbm-compat4:armhf.list': Input/output error
vdir: '/var/lib/dpkg/info/libldap-2.5-0:armhf.shlibs': Input/output error

and any further accesses to the filesystem gave -EIO! As it's the root
filesystem, no further commands could be run, not even those which I'd
just run. The only solution was to destroy the VM via libvirt.

This, despite the filesystem mounted:

/dev/vda1 on / type ext4 (rw,relatime,errors=remount-ro)

So why it started spitting out -EIO errors I've no idea. That seems
utter madness.

Anyway, I re-ran e2fsck, fixing the errors:

e2fsck 1.46.2 (28-Feb-2021)
/dev/loop0p1: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Inode 657, i_blocks is 8, should be 0. Fix<y>? yes
Pass 2: Checking directory structure
Inode 657 (/var/lib/dpkg/info/libacl1:armhf.triggers) has invalid mode (0171223).
Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -88667
Fix<y>? yes
Free blocks count wrong for group #2 (21314, counted=21315).
Fix<y>? yes
Free blocks count wrong (862994, counted=862964).
Fix<y>? yes
Free inodes count wrong (292360, counted=292352).
Fix<y>? yes

/dev/loop0p1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/loop0p1: 38272/330624 files (0.4% non-contiguous), 459276/1322240 blocks

before restarting the VM back into 6.7.0.

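As an aside on the inode that got cleared: the "invalid mode (0171223)"
has all four file-type bits set (mode & S_IFMT comes out as 0170000),
which matches no defined file type at all, so the on-disk inode really
did contain junk rather than stale-but-plausible data.  A trivial
userspace decode (nothing ext4-specific, just picking the octal value
apart):

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
        unsigned int mode = 0171223;    /* the mode e2fsck complained about */

        printf("type bits: %#o (S_IFMT is %#o)\n", mode & S_IFMT, S_IFMT);
        printf("perm bits: %#o\n", mode & ~S_IFMT);
        printf("REG=%d DIR=%d LNK=%d CHR=%d BLK=%d FIFO=%d SOCK=%d\n",
               S_ISREG(mode), S_ISDIR(mode), S_ISLNK(mode), S_ISCHR(mode),
               S_ISBLK(mode), S_ISFIFO(mode), S_ISSOCK(mode));
        return 0;
}

Every S_IS*() test rejects it, which presumably is why e2fsck offered
to clear the inode outright rather than try to repair it.
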
-- 
*** please note that I probably will only be occasionally responsive
*** for an unknown period of time due to recent eye surgery making
*** reading quite difficult.

RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!