Re: Corrupted i_blocks field

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-10-01 00:13, Jesper Dybdal wrote:
On 2024-09-30 22:29, Andreas Dilger wrote:
On Sep 27, 2024, at 8:38 AM, Jesper Dybdal<jd-ext4@xxxxxxxxx> wrote:
I have now a few times experienced a problem with the i_blocks field of a few inodes being corrupted (replaced by extremely large numbers).

I don't believe that it is a disk error - the file system is on a RAID1 partition and the RAID consistency is checked regularly.
I also find it hard to believe that it is a RAM error - the machine has run memtest86+ overnight without finding anything.

The files I've seen corrupted are simple small text files that are modified only using an ordinary text editor (emacs).

Fsck fixes it.
The system is an up-to-date Debian Bookworm:
     Linux nuser 6.1.0-25-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.106-3 (2024-08-26) x86_64 GNU/Linux

I do one thing that is not the default for ext4: I use the "nodelalloc" option (because several years ago, there was a discussion about "delalloc or not" from which I got the impression that nodelalloc was probably slightly safer - if the resulting performance reduction is not a problem, which it is not for me):
      /dev/md0 on / type ext4 (rw,relatime,nodelalloc,errors=remount-ro)

Three examples follow below.  Note that the bad field values, when interpreted as 48-bit signed numbers, are numerically small negative numbers (-25, -9, -3, respectively).

Excerpts from the fsck logs:
root: Inode 10748715, i_blocks is 281474976710631, should be 5. FIXED.
root: Inode 10751288, i_blocks is 281474976710647, should be 3. FIXED.
root: Inode 10748542, i_blocks is 281474976710653, should be 1. FIXED.

I don't know when the first two of these corruptions occurred, but the last one happened yesterday or the day before.  The file in question was /etc/fstab, and I discovered the problem after I had edited fstab on Wednesday and rebooted on Thursday.

The corrupted files can be read and copied without problems.  I have not dared to delete any of those files before fsck had fixed them.

What is going on here?
This looks like an underflow of the used blocks count on the inode:

     281474976710631 = 0xffffffffffe7
     281474976710647 = 0xfffffffffff7
     281474976710653 = 0xfffffffffffd

This is 2^48 blocks, which is the limit for the number of blocks that fit
into the available inode fields (32-bit i_blocks_lo, 16-bit i_blocks_hi).

There is likely some kind of accounting error in the code.  Is anything
unusual with access patterns for those files (large xattrs/ACLs, are they
files or directories or special files. mmap, truncate, fallocate, etc.)?
No.  They are all simple small text configuration files, and I edit them using Emacs.  The only slightly unusual thing is, as I wrote earlier, that the file system is mounted with the nodelalloc option.

The files I have identified are fstab and  two postfix configuration files: /etc/postfix/{main.cf,master.cf} .  The problem has actually hit master.cf twice.

I have verified that the only reboot that happened between the fstab edit on Wednesday and  seeing the problem Thursday, was a clean deliberate reboot - no power outage of similar.
If you are able to reproduce with the /etc/fstab editing, possibly strace
could help to identify if something unusual is being done to the file.
I'll try, but I do not really expect Emacs to do strange things to the file

It happened again, and this time it affected no less than 30 files. And this time, the bad i_blocks values (when interpreted as a signed number) were not all negative.  Also, this time it did not affect only very small files.  There is a list at the end of this message.

It happened early this morning when the Debian "unattended upgrade" functionality upgraded the system to Debian 12.8 and automatically rebooted.  So it seems that it happens in connection with reboots (automatic or manual), and mostly affects files that have been modified recently - the earliest modification time was October 18. I had some time ago turned delalloc on in order to use settings that most people use, so it should be a perfectly normal ext4 file system with default settings.

For some reason it only affects files on the md0 file system - there is also an md1 on the same disks which seems to have no such problems.

I have now set the mount count of the md0 partition to 1, so it will be checked on every boot.

But I would really appreciate it if somebody who knows the ext4 code could explain what is happening - and tell me whether these incorrect i_blocks values are dangerous.

The kernel versions before and after the upgrade were:
6.1.0-26-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.112-1 (2024-09-30) x86_64 GNU/Linux 6.1.0-27-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.115-1 (2024-11-01) x86_64 GNU/Linux

Thanks,
Jesper

Here is a list of the 30 files, with fsck log lines merged with info about the files:

root: Inode 10748775, i_blocks is 281474976710654, should be 1. FIXED.
 10748775      4 -rw-r--r--   1 root     root         2238 Oct 24 23:56 /etc/fail2ban/jail.local

root: Inode 10749866, i_blocks is 281474976710628, should be 1. FIXED.
 10749866      4 -rw-r--r--   1 root     root            5 Nov 10 06:41 /etc/letsencrypt/webroot/.well-known/acme-challenge/test

root: Inode 10750308, i_blocks is 281474976710651, should be 5. FIXED.
 10750308     20 -rw-r--r--   1 root     root        17145 Oct 18 12:30 /etc/postfix/main.cf

root: Inode 10750322, i_blocks is 0, should be 24.  FIXED.
 10750322     12 -rw-r--r--   1 root     root        10782 Oct 17 23:57 /etc/postfix/master.cf

root: Inode 10751867, i_blocks is 281474976710652, should be 1. FIXED.
 10751867      4 -rw-r--r--   1 root     root         1827 Oct 27 18:52 /etc/postfix/sender_access

root: Inode 13503081, i_blocks is 0, should be 8.  FIXED.
 13503081      4 -rwxr-xr-x   1 root     staff          43 Oct 20 13:13 /usr/local/bin/di

root: Inode 13506953, i_blocks is 281474976710653, should be 1. FIXED.
 13506953      4 -rwx------   1 root     root          283 Oct 20 13:32 /usr/local/sbin/mailspamstatus

root: Inode 13514123, i_blocks is 281474976710655, should be 1. FIXED.
 13514123      4 -rwxr-xr-x   1 root     staff          66 Oct 20 13:14 /usr/local/sbin/dmarcdig

root: Inode 20709497, i_blocks is 281474976710642, should be 1. FIXED.
 20709497      4 -rw-r--r--   1 root     root          627 Nov 10 04:25 /root/.wget-hsts

root: Inode 20710430, i_blocks is 281474976710654, should be 1. FIXED.
 20710430      4 -rw-r--r--   1 root     root          273 Nov 10 01:00 /root/relays/relays-all.txt

root: Inode 20710440, i_blocks is 281474976710654, should be 1. FIXED.
 20710440      4 -rw-r--r--   1 root     root          183 Nov 10 01:01 /root/relays/relays-em.txt

root: Inode 20710442, i_blocks is 281474976710654, should be 1. FIXED.
 20710442      4 -rw-r--r--   1 root     root          170 Nov 10 01:01 /root/relays/relays-ak.txt

root: Inode 26738965, i_blocks is 9472, should be 9760.  FIXED.
 26738965   4880 -rw-r--r--   1 root     root      4990910 Nov 10 10:47 /var/lib/smartmontools/attrlog.WDC_WD40EFZX_68AWUN0-WD_WX52D62RAX00.ata.csv

root: Inode 26738966, i_blocks is 9472, should be 9760.  FIXED.
 26738966   4880 -rw-r--r--   1 root     root      4990905 Nov 10 10:47 /var/lib/smartmontools/attrlog.WDC_WD40EFZX_68AWUN0-WD_WX42D52AYR2E.ata.csv

root: Inode 26739233, i_blocks is 281474976710547, should be 4. FIXED.
 26739233     12 -rw-------   1 postfix  postfix     12288 Nov 10 10:42 /var/lib/postfix/smtp_scache.db

root: Inode 26746338, i_blocks is 163264, should be 163312.  FIXED.
 26746338  81656 -rw-------   1 amavis   amavis   83611648 Nov 10 11:01 /var/lib/amavis/.spamassassin/bayes_seen

root: Inode 26746343, i_blocks is 281474976710436, should be 1. FIXED.
 26746343      4 -rw-r--r--   1 amavis   amavis        999 Nov 10 06:38 /var/lib/amavis/.razor/server.c302.cloudmark.com.conf

root: Inode 26746345, i_blocks is 281474976710427, should be 1. FIXED.
 26746345      4 -rw-r--r--   1 amavis   amavis        999 Nov 10 06:04 /var/lib/amavis/.razor/server.c301.cloudmark.com.conf

root: Inode 26746346, i_blocks is 281474976709985, should be 1. FIXED.
 26746346      4 -rw-r--r--   1 amavis   amavis         57 Nov 10 06:38 /var/lib/amavis/.razor/servers.catalogue.lst

root: Inode 26746347, i_blocks is 281474976710429, should be 1. FIXED.
 26746347      4 -rw-r--r--   1 amavis   amavis        999 Nov 10 06:04 /var/lib/amavis/.razor/server.c303.cloudmark.com.conf

root: Inode 26746349, i_blocks is 281474976709981, should be 1. FIXED.
 26746349      4 -rw-r--r--   1 amavis   amavis         76 Nov 10 06:38 /var/lib/amavis/.razor/servers.nomination.lst

root: Inode 27000902, i_blocks is 37344, should be 37384.  FIXED.
 27000902  18692 -rw-r--r--   1 minidlna minidlna 19136512 Nov 10 09:17 /var/cache/minidlna/files.db

root: Inode 27001023, i_blocks is 1768, should be 1808.  FIXED.
 27001023    904 -rw-rw-r--   1 root     utmp       921600 Nov 10 09:17 /var/log/wtmp

root: Inode 27001185, i_blocks is 56, should be 64.  FIXED.
 27001185     32 -rw-r--r--   1 root     root        25612 Oct 31 06:43 /var/log/dpkg.log.1

root: Inode 27001253, i_blocks is 23720, should be 55152.  FIXED.
 27001253  27576 -rw-rw----   1 root     utmp     28232064 Oct 31 23:59 /var/log/btmp.1

root: Inode 27001462, i_blocks is 400, should be 408.  FIXED.
 27001462    204 -rw-r--r--   1 root     root       204239 Nov 10 04:25 /var/log/nuser-blocked

root: Inode 27001550, i_blocks is 112, should be 128.  FIXED.
 27001550     64 -rw-r--r--   1 root     root        58918 Nov 10 00:54 /var/log/samba/log.samba-dcerpcd

root: Inode 27001552, i_blocks is 144, should be 168.  FIXED.
 27001552     84 -rw-r--r--   1 root     root        79569 Nov 10 00:54 /var/log/samba/log.rpcd_winreg

root: Inode 27001579, i_blocks is 5664, should be 7656.  FIXED.
 27001579   3828 -rw-r--r--   1 root     root      3912616 Oct 18 00:00 /var/log/atop/atop_20241017

root: Inode 27001641, i_blocks is 128, should be 160.  FIXED.
 27001641     80 -rw-r--r--   1 root     root        73910 Nov 10 00:54 /var/log/samba/log.rpcd_classic

--
Jesper Dybdal
https://www.dybdal.dk








[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux