Re: xfs corrupted

Stefanita Rares Dumitrescu <katmai@xxxxxxxxxxxxxxx> · Tue, 15 Oct 2013 20:45:59 +0200

That was the first thing I checked: the array was optimal, and I checked
each drive with smartctl; they are all fine.
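
For anyone repeating the per-drive check, here is a minimal sketch of how the health line in a saved `smartctl -H <device>` report could be read programmatically. The function name `smart_health` is hypothetical, and the sketch assumes the report was captured to a string beforehand. Drives behind a hardware RAID controller often need a controller-specific `-d` option, which is not shown here.

```python
def smart_health(report):
    """Return True or False for a passing or failing health line, None if absent.

    `report` is assumed to be the text printed by `smartctl -H <device>`.
    """
    for line in report.splitlines():
        line = line.strip()
        # ATA/SATA drives print an overall-health self-assessment result.
        if line.startswith("SMART overall-health self-assessment test result:"):
            return line.split(":", 1)[1].strip() == "PASSED"
        # SCSI/SAS drives print a health status line instead.
        if line.startswith("SMART Health Status:"):
            return line.split(":", 1)[1].strip() == "OK"
    # No recognizable health line: the caller should treat this as "unknown".
    return None
```

A passing health line is only a coarse signal; the attribute table from `smartctl -A` (reallocated or pending sectors, for instance) is worth a look as well.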

I left xfs_repair running overnight, and it showed no progress. I was
actually thinking that the memory might be bad, so I took the server
offline this morning and ran memtest for 3 hours, which showed
nothing wrong with the sticks. There is some good news, however:

I was able to mount the array, but I can only read from it. Whenever I
try to write something, it just hangs right there.

I ran xfs_repair -n on the second array, which is 18 TB in size as
opposed to 14 TB for the first one, and that check completed in about 10
minutes.

I am now running xfs_repair -n on the bad 14 TB array, and it has been
stuck here for about 5 hours.

[root@kp4 ~]# umount /home
[root@kp4 ~]# xfs_repair -n /dev/sdc
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0

What worries me is that I see 100% CPU usage and some 74% memory usage (I
have 4 GB of RAM), but no disk activity at all. I would have expected
at least some reads if xfs_repair were doing something.
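
One way to confirm the "no disk activity" observation for the xfs_repair process itself is to sample its per-process I/O counters twice and compare them. The sketch below assumes a Linux kernel that exposes `/proc/<pid>/io` and a user allowed to read it (root, or the owner of the process). The helper names `parse_proc_io`, `io_delta` and `sample_io` are made up for illustration.

```python
import time


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value.strip())
    return counters


def io_delta(before, after):
    """Bytes actually read from and written to storage between two samples."""
    return {k: after[k] - before[k] for k in ("read_bytes", "write_bytes")}


def sample_io(pid, interval=10.0):
    """Read /proc/<pid>/io twice, `interval` seconds apart, and return the delta."""
    path = f"/proc/{pid}/io"
    with open(path) as f:
        before = parse_proc_io(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_io(f.read())
    return io_delta(before, after)
```

Calling `sample_io(pid)` with the PID of the running xfs_repair and getting zeros for both counters would match what I am seeing: the process is burning CPU without touching the disks during that interval.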

On 15/10/2013 20:34, Emmanuel Florac wrote:
On Tue, 15 Oct 2013 01:41:47 -0700 (PDT), you wrote:

Did I jump the gun by using the -L switch :/ ?

You should have checked that the RAID is optimal first! In case of
failing hardware, any write to the volume can exacerbate problems.

You should use arcconf to check the RAID state (arcconf getstatus
1) and, if necessary, run a RAID repair (arcconf task start 1 logicaldrive
0 verify_fix).

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs