Re: "Internal error xfs_attr3_leaf_write_verify at line 216", "directory flags set on non-directory inode" and other errors

On 07 Jul 2015, at 02:19, Dave Chinner <david@xxxxxxxxxxxxx> wrote:

> On Mon, Jul 06, 2015 at 01:08:52PM +0200, Rasmus Borup Hansen wrote:
>> I've made a metadump and I'm running another xfs_repair, but given
>> that the first metadump is 132 GB, will you still be interested in
>> looking at the dumps?
>
> That's significantly larger than my monthly download quota.  How big
> is it once you compress it?

One metadump is 25 GB when compressed with xz -9. The server the files currently reside on is not very fast, so I've only compressed one of them so far.
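In case it is useful, this is roughly how I'm compressing the dumps for transfer. The file below is a tiny stand-in so the commands can be run anywhere, not the real 132 GB dump, and the flags beyond -9 are just what I'd use with a recent xz, not something from the original run:

```shell
# Make a small stand-in file; the real command targets the metadump itself.
printf 'xfs metadump sample %.0s' $(seq 1000) > sample.metadump
# xz -9 for the best ratio, -T0 to use all cores, -k to keep the original.
xz -9 -T0 -k sample.metadump
# Verify the archive before deleting or shipping anything.
xz -t sample.metadump.xz && echo "archive OK"
rm -f sample.metadump sample.metadump.xz
```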

I used the strings command on the metadump files and discovered that they contain fragments of files that we really don't want to leave our IT systems. However, if you think it's worth the effort, I could set up a virtual machine with the metadump files and give you access with your SSH public key. But then you'll have to tell me which tools you'll need for investigating the files.
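For reference, this is the kind of scan I ran, shown here on a tiny stand-in file rather than the real dump (the file name and the marker string are made up for illustration). As I understand it, xfs_metadump obfuscates most names by default, but metadata blocks can still carry remnants of file contents, so a check like this is worth doing before a dump leaves the site:

```shell
# Create a stand-in file; the real scan targets the metadump itself.
printf 'metadata\0CONFIDENTIAL-project-name\0more metadata' > dump.sample
# strings pulls out printable runs; count hits for text that must not leak.
strings dump.sample | grep -c 'CONFIDENTIAL'
rm -f dump.sample
```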

> Also, because of the size of the metadump, I'll need some context
> about the hardware it is running on. Can you please also provide
> the information in:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> so I have a better idea of the environment the problem is showing up in.

$ uname -a
Linux mammuthus 3.13.0-55-generic #94-Ubuntu SMP Thu Jun 18 00:27:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ ./xfs_repair -V
xfs_repair version 3.2.3

$ cat /proc/cpuinfo | grep processor | wc -l
2

$ cat /proc/meminfo
MemTotal:       10228560 kB
MemFree:          300612 kB
Buffers:          111996 kB
Cached:          3569448 kB
SwapCached:         7836 kB
Active:          1915848 kB
Inactive:        2510244 kB
Active(anon):     358064 kB
Inactive(anon):   386784 kB
Active(file):    1557784 kB
Inactive(file):  2123460 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       2088956 kB
SwapFree:        2042964 kB
Dirty:              3972 kB
Writeback:             0 kB
AnonPages:        739808 kB
Mapped:            18340 kB
Shmem:                72 kB
Slab:            4672980 kB
SReclaimable:    3927540 kB
SUnreclaim:       745440 kB
KernelStack:        2328 kB
PageTables:         8840 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     7203236 kB
Committed_AS:    1152936 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      309048 kB
VmallocChunk:   34359412736 kB
HardwareCorrupted:     0 kB
AnonHugePages:      8192 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       62052 kB
DirectMap2M:    10414080 kB

$ cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=5103100k,nr_inodes=1275775,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=1022856k,mode=755 0 0
/dev/mapper/mammuthus-root / ext4 rw,relatime,errors=remount-ro,data="" 0 0
none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
none /sys/fs/pstore pstore rw,relatime 0 0
/dev/sdd1 /boot ext2 rw,relatime 0 0
systemd /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,name=systemd 0 0
/dev/mapper/backup02-limited_backup /limited_backup xfs rw,noatime,attr2,inode64,logbsize=256k,noquota 0 0
/dev/mapper/backup02-timemachine /timemachine_backup ext4 rw,relatime,errors=remount-ro,data="" 0 0
/dev/mapper/backup01-data /backup xfs rw,noatime,attr2,inode64,logbsize=256k,noquota 0 0

This is what /proc/mounts looks like now. When the error first occurred, the backup02 volume group did not yet exist, and /dev/mapper/backup01-data (which holds the file system that behaves strangely) was mounted with user and project quotas enabled. The file system was, of course, not mounted while xfs_repair was running.

$ cat /proc/partitions
major minor  #blocks  name

   8       16 19529728000 sdb
   8       17 19529726959 sdb1
   8        0 39064698880 sda
   8        1 39064697839 sda1
 252        0 39064694784 dm-0
   8       32  976224256 sdc
   8       33  976223215 sdc1
   8       48  155713536 sdd
   8       49     248832 sdd1
   8       50          1 sdd2
   8       53  155461632 sdd5
 252        1  153370624 dm-1
 252        2    2088960 dm-2
 252        3   33554432 dm-3
 252        4  942665728 dm-4
 252        5 5368709120 dm-5
 252        6 14161014784 dm-6

I'm using hardware RAID level 6 with a Dell PERC H800 controller and an MD1200 disk enclosure holding twelve 4 TB disks configured as a single "virtual disk":

$ /opt/dell/srvadmin/bin/omreport storage vdisk controller=0
List of Virtual Disks on Controller PERC H800 Adapter (Slot 1)

Controller PERC H800 Adapter (Slot 1)
ID                                : 0
Status                            : Ok
Name                              : mammuthus01
State                             : Ready
Hot Spare Policy violated         : Not Assigned
Encrypted                         : No
Layout                            : RAID-6
Size                              : 37,255.00 GB (40002251653120 bytes)
T10 Protection Information Status : No
Associated Fluid Cache State      : Not Applicable
Device Name                       : /dev/sda
Bus Protocol                      : SAS
Media                             : HDD
Read Policy                       : Adaptive Read Ahead
Write Policy                      : Write Back
Cache Policy                      : Not Applicable
Stripe Element Size               : 64 KB
Disk Cache Policy                 : Disabled

This "virtual disk" is the only member of the volume group "backup01":

$ sudo pvscan
  PV /dev/sdd5   VG mammuthus   lvm2 [148.26 GiB / 0    free]
  PV /dev/sdc1   VG extra       lvm2 [931.00 GiB / 0    free]
  PV /dev/sdb1   VG backup02    lvm2 [18.19 TiB / 0    free]
  PV /dev/sda1   VG backup01    lvm2 [36.38 TiB / 0    free]
  Total: 4 [55.62 TiB] / in use: 4 [55.62 TiB] / in no VG: 0 [0   ]

This volume group has a single logical volume:

$ sudo lvscan
  ACTIVE            '/dev/mammuthus/root' [146.27 GiB] inherit
  ACTIVE            '/dev/mammuthus/swap_1' [1.99 GiB] inherit
  ACTIVE            '/dev/extra/swap' [32.00 GiB] inherit
  ACTIVE            '/dev/extra/files' [899.00 GiB] inherit
  ACTIVE            '/dev/backup02/timemachine' [5.00 TiB] inherit
  ACTIVE            '/dev/backup02/limited_backup' [13.19 TiB] inherit
  ACTIVE            '/dev/backup01/data' [36.38 TiB] inherit

The drives are twelve 4 TB 7.2K RPM near-line SAS 3.5" hot-plug drives:

$ /opt/dell/srvadmin/bin/omreport storage pdisk controller=0 connector=0
List of Physical Disks on Connector 0 of Controller PERC H800 Adapter (Slot 1)

Controller PERC H800 Adapter (Slot 1)
ID                              : 0:0:0
Status                          : Ok
Name                            : Physical Disk 0:0:0
State                           : Online
Power Status                    : Spun Up
Bus Protocol                    : SAS
Media                           : HDD
Part of Cache Pool              : Not Applicable
Remaining Rated Write Endurance : Not Applicable
Failure Predicted               : No
Revision                        : GS0D
Driver Version                  : Not Applicable
Model Number                    : Not Applicable
T10 PI Capable                  : No
Certified                       : Yes
Encryption Capable              : No
Encrypted                       : Not Applicable
Progress                        : Not Applicable
Mirror Set ID                   : Not Applicable
Capacity                        : 3,725.50 GB (4000225165312 bytes)
Used RAID Disk Space            : 3,725.50 GB (4000225165312 bytes)
Available RAID Disk Space       : 0.00 GB (0 bytes)
Hot Spare                       : No
Vendor ID                       : DELL(tm)
Product ID                      : ST4000NM0023
Serial No.                      : Z1Z4BNJ3
Part Number                     : TH0529FG2123345Q022HA02
Negotiated Speed                : 6.00 Gbps
Capable Speed                   : 6.00 Gbps
PCIe Maximum Link Width         : Not Applicable
PCIe Negotiated Link Width      : Not Applicable
Sector Size                     : 512B
Device Write Cache              : Not Applicable
Manufacture Day                 : 04
Manufacture Week                : 21
Manufacture Year                : 2014
SAS Address                     : 5000C50058C0F211

(Only the output for the first drive is shown; the others are similar.)

The individual drives don't use write caches, but the storage controller has a 512 MB battery-backed cache operating in write-back mode.

$ xfs_info /backup/
meta-data="" isize=256    agcount=37, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=9766173696, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
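As an aside: sunit=0/swidth=0 above means the file system was created without stripe alignment hints, even though the array has a 64 KB stripe element and, with RAID-6 on 12 drives, 10 data disks per stripe. If I calculate correctly, an aligned mkfs would have used the values sketched below (illustrative only, not something from the original setup):

```shell
# RAID geometry from the omreport output: 64 KB stripe element, RAID-6 on 12 disks.
stripe_kb=64
disks=12
parity=2                           # RAID-6 keeps two parity disks per stripe
data_disks=$((disks - parity))
full_stripe_kb=$((stripe_kb * data_disks))
# An aligned mkfs would have looked like: mkfs.xfs -d su=${stripe_kb}k,sw=${data_disks} ...
echo "su=${stripe_kb}k sw=${data_disks} full_stripe=${full_stripe_kb}KB"
```

As far as I know, missing alignment hints cost performance, not correctness, so this would not by itself explain the corruption.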

Output from ls when listing "lost+found":

$ ls -laF /backup/lost+found/
ls: /backup/lost+found/11539619467: Structure needs cleaning
total 4
drwxr-xr-x 2 root root     32 Jun 30 07:43 ./
drwxr-xr-x 5 root root     74 Jul  2 12:55 ../
-rw-rw-rw- 1 tsj  intomics  0 Jun 23 16:11 11539619467

Relevant output from dmesg (the errors are generated by the ls command above):

[444852.252110] XFS (dm-0): Mounting Filesystem
[444854.630181] XFS (dm-0): Ending clean mount
[503166.397439] ffff880114063000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00  ................
[503166.425056] ffff880114063010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00  ..... ..........
[503166.453484] ffff880114063020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[503166.480812] ffff880114063030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[503166.508386] XFS (dm-0): Internal error xfs_attr3_leaf_read_verify at line 246 of file /build/buildd/linux-3.13.0/fs/xfs/xfs_attr_leaf.c.  Caller 0xffffffffa00d4885
[503166.562354] CPU: 1 PID: 3342 Comm: kworker/1:1H Not tainted 3.13.0-55-generic #94-Ubuntu
[503166.562356] Hardware name: Dell Inc. PowerEdge R310/05XKKK, BIOS 1.8.2 08/17/2011
[503166.562394] Workqueue: xfslogd xfs_buf_iodone_work [xfs]
[503166.562398]  0000000000000001 ffff8802af92bd68 ffffffff81723294 ffff88004d1b2000
[503166.562402]  ffff8802af92bd80 ffffffffa00d76fb ffffffffa00d4885 ffff8802af92bdb8
[503166.562403]  ffffffffa00d7755 000000f600200400 ffff88001b216a00 ffff88004d1b2000
[503166.562406] Call Trace:
[503166.562415]  [<ffffffff81723294>] dump_stack+0x45/0x56
[503166.562426]  [<ffffffffa00d76fb>] xfs_error_report+0x3b/0x40 [xfs]
[503166.562436]  [<ffffffffa00d4885>] ? xfs_buf_iodone_work+0x85/0xf0 [xfs]
[503166.562446]  [<ffffffffa00d7755>] xfs_corruption_error+0x55/0x80 [xfs]
[503166.562459]  [<ffffffffa00f4bdd>] xfs_attr3_leaf_read_verify+0x6d/0xf0 [xfs]
[503166.562469]  [<ffffffffa00d4885>] ? xfs_buf_iodone_work+0x85/0xf0 [xfs]
[503166.562479]  [<ffffffffa00d4885>] xfs_buf_iodone_work+0x85/0xf0 [xfs]
[503166.562483]  [<ffffffff81083b22>] process_one_work+0x182/0x450
[503166.562485]  [<ffffffff81084911>] worker_thread+0x121/0x410
[503166.562487]  [<ffffffff810847f0>] ? rescuer_thread+0x430/0x430
[503166.562489]  [<ffffffff8108b702>] kthread+0xd2/0xf0
[503166.562491]  [<ffffffff8108b630>] ? kthread_create_on_node+0x1c0/0x1c0
[503166.562494]  [<ffffffff81733ca8>] ret_from_fork+0x58/0x90
[503166.562496]  [<ffffffff8108b630>] ? kthread_create_on_node+0x1c0/0x1c0
[503166.562498] XFS (dm-0): Corruption detected. Unmount and run xfs_repair
[503166.589297] XFS (dm-0): metadata I/O error: block 0x157e84da0 ("xfs_trans_read_buf_map") error 117 numblks 8

The error occurs even when the file system is otherwise idle; nothing else is touching it at the time.

Intomics is a contract research organization specialized in deriving core biological insight from large scale data. We help our clients in the pharmaceutical industry develop tomorrow's medicines better, faster, and cheaper through optimized use of biomedical data.
-----------------------------------------------------------------
Hansen, Rasmus Borup              Intomics - from data to biology
System Administrator              Diplomvej 377
Scientific Programmer             DK-2800 Kgs. Lyngby
                                  Denmark
E: rbh@xxxxxxxxxxxx               W: http://www.intomics.com/
P: +45 5167 7972                  P: +45 8880 7979

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
