XFS File system in trouble

I have a 24 TB XFS file system that is very sick and seemingly getting sicker. I believe the problem is in the file system itself: I have already replaced the RAID chassis, the OS, the cables, the drive controller, and most of the drives. Re-syncing the RAID array completes in a reasonable time, given the size of the array, and reports no mismatches. xfs_repair completes, usually finding no errors, occasionally one or two. Some commands, such as df, are now hanging, and writes often fail with I/O errors. I haven't found any obvious file corruption, but computing checksums with md5sum, sha256sum, etc. comes up with different values every time they are run on many large files. What can I do to try to rectify this?
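To show what I mean by the checksum instability, here is a minimal sketch of the test I am running (the path is hypothetical; substitute any large file on the affected filesystem):

```shell
# Hypothetical path -- substitute a large file on the affected filesystem.
F=/RAID/some_large_file

# Checksum the same file several times.  On healthy hardware every run
# must print the identical digest; more than one unique line of output
# means the read path (RAM, controller, cabling, or the filesystem
# itself) is returning different data on each read.
for i in 1 2 3; do
    md5sum "$F"
done | sort -u
```

On this system the loop regularly prints several different digests for the same unmodified file.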

Kernel: 3.16.0-4-amd64
Xfsprogs: 3.2.3
8 CPUs
/proc/meminfo:
MemTotal:        8095952 kB
MemFree:         7005032 kB
MemAvailable:    7393072 kB
Buffers:          201804 kB
Cached:           310752 kB
SwapCached:            0 kB
Active:           637704 kB
Inactive:         132232 kB
Active(anon):     258320 kB
Inactive(anon):     3888 kB
Active(file):     379384 kB
Inactive(file):   128344 kB
Unevictable:           0 kB
Mlocked:               4 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                40 kB
Writeback:             0 kB
AnonPages:        257376 kB
Mapped:           121392 kB
Shmem:              4824 kB
Slab:             141708 kB
SReclaimable:      98512 kB
SUnreclaim:        43196 kB
KernelStack:        5072 kB
PageTables:        18832 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     4047976 kB
Committed_AS:    1189596 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      366160 kB
VmallocChunk:   34359349248 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       88660 kB
DirectMap2M:     4003840 kB
DirectMap1G:     4194304 kB

/proc/mounts:
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=10240k,nr_inodes=1001559,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=809596k,mode=755 0 0
/dev/sdd2 / ext4 rw,noatime,errors=remount-ro,data=ordered 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
pstore /sys/fs/pstore pstore rw,relatime 0 0
tmpfs /run/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=1619180k 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
/dev/sdd1 /boot ext2 rw,noatime 0 0
tmpfs /var/www/vidmgr/artwork tmpfs rw,relatime,size=16384k 0 0
/dev/md2 /OldDrive ext4 rw,relatime,data=ordered 0 0
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
Backup:/var/www /var/www/backup nfs rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.51,mountvers=3,mountport=49438,mountproto=tcp,local_lock=none,addr=192.168.1.51 0 0
cgroup /sys/fs/cgroup tmpfs rw,relatime,size=12k 0 0
cgmfs /run/cgmanager/fs tmpfs rw,relatime,size=100k,mode=755 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
systemd /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/x86_64-linux-gnu/systemd-shim-cgroup-release-agent,name=systemd 0 0
tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=809596k,mode=700 0 0
Backup:/Backup /Backup nfs rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.51,mountvers=3,mountport=57420,mountproto=tcp,local_lock=none,addr=192.168.1.51 0 0
/dev/md0 /RAID xfs rw,relatime,attr2,inode64,sunit=2048,swidth=12288,noquota 0 0
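Since df hangs, one way to isolate which of the mounts above is stalling it is to probe each mount point separately with a timeout (a sketch; the paths are taken from the /proc/mounts listing above):

```shell
# Probe each mount point individually.  df touches every mounted
# filesystem, so a single hung mount stalls the whole command; a
# per-mount stat with a timeout identifies the culprit.
for m in / /boot /OldDrive /var/www/backup /Backup /RAID; do
    if timeout 5 stat -f "$m" >/dev/null 2>&1; then
        echo "ok:   $m"
    else
        echo "hang: $m"
    fi
done
```

In my case only /RAID (the XFS filesystem on md0) times out; the NFS mounts respond normally.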

/proc/partitions:
major minor  #blocks  name

   8        0  125034840 sda
   8        1      96256 sda1
   8        2  112305152 sda2
   8        3   12632064 sda3
   8       16  125034840 sdb
   8       17      96256 sdb1
   8       18  112305152 sdb2
   8       19   12632064 sdb3
   8       32 3907018584 sdc
   9        1      96128 md1
   9        3   12623872 md3
   9        2  112239616 md2
  11        0    1048575 sr0
   8       48  488386584 sdd
   8       49      96256 sdd1
   8       50  112305152 sdd2
   8       51   12632064 sdd3
   9        0 23441319936 md0
   8       64 4883770584 sde
   8       80 4883770584 sdf
   8       96 3907018584 sdg
   8      112 4883770584 sdh
   8      128 4883770584 sdi
   8      144 3907018584 sdj
   8      160 3907018584 sdk

mdadm -D /dev/md0:
/dev/md0:
        Version : 1.2
  Creation Time : Fri Oct  3 20:06:55 2014
     Raid Level : raid6
     Array Size : 23441319936 (22355.39 GiB 24003.91 GB)
  Used Dev Size : 3906886656 (3725.90 GiB 4000.65 GB)
   Raid Devices : 8
  Total Devices : 8
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Jul 17 19:47:45 2015
          State : clean 
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 1024K

           Name : RAID-Server:0  (local to host RAID-Server)
           UUID : d26e92db:8bd207bb:db9bec69:4117ed57
         Events : 698300

    Number   Major   Minor   RaidDevice State
      10       8      128        0      active sync   /dev/sdi
      12       8      112        1      active sync   /dev/sdh
       8       8       80        2      active sync   /dev/sdf
       9       8       64        3      active sync   /dev/sde
      11       8       96        4      active sync   /dev/sdg
       5       8       32        5      active sync   /dev/sdc
       6       8      160        6      active sync   /dev/sdk
       7       8      144        7      active sync   /dev/sdj

No LVM

8 SATA disks, various manufacturers, 4 TB and 5 TB

dmesg is unremarkable prior to echo w > /proc/sysrq-trigger:
[112915.907065] md: md0: requested-resync done.
[134859.522323] XFS (md0): Mounting V4 Filesystem
[134860.767122] XFS (md0): Ending clean mount
[135019.548703] XFS (md0): Mounting V4 Filesystem
[135019.817854] XFS (md0): Ending clean mount

xfs_info:
meta-data=/dev/md0               isize=256    agcount=32, agsize=183135488 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=5860329984, imaxpct=5
         =                       sunit=256    swidth=1536 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
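For what it's worth, the stripe geometry above looks self-consistent: with RAID6 over 8 disks (6 data disks) and a 1024 KiB chunk, the expected XFS sunit/swidth can be checked with a little shell arithmetic (a sketch, using the values from the mdadm -D and xfs_info output above):

```shell
chunk_kib=1024    # Chunk Size from mdadm -D
data_disks=6      # 8 RAID6 devices minus 2 parity
fs_block=4096     # bsize from xfs_info

# Expected stripe unit and stripe width in filesystem blocks.
sunit_blks=$(( chunk_kib * 1024 / fs_block ))
swidth_blks=$(( sunit_blks * data_disks ))
echo "sunit=$sunit_blks swidth=$swidth_blks"
# prints: sunit=256 swidth=1536
```

That matches the sunit=256, swidth=1536 blks reported by xfs_info, so the filesystem's stripe alignment does not appear to be the problem.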

After echo w > /proc/sysrq-trigger:
http://fletchergeek.com/images/dmesg.txt

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs

