Hello Dave,
My name is Danny Shavit and I am with Zadara storage.
We would appreciate your feedback regarding an XFS corruption and xfs_repair issue.
We found a corrupted XFS volume in one of our systems. It is around 1 TB in size and holds about 12 M files.
While running xfs_repair on it (log below), we noticed that memory consumption rose to about 7.5 GB.
Since some customers are using only 4 GB of RAM (and sometimes even 2 GB), we tried running "xfs_repair -m 3200" on a 4 GB RAM machine.
However, this time an OOM event happened while handling AG 26 in phase 3.
https://zadarastorage-public.s3.amazonaws.com/xfs/xfsdump-prod-ebs_2015-03-30_23-00-38.tgz
--
Thank you,
Danny Shavit
Zadara Storage
---------- xfs_repair log ----------------
root@vsa-00000428-vc-1:/export/4xfsdump# date; xfs_repair -v /dev/dm-55; date
Tue Mar 31 02:28:04 PDT 2015
Phase 1 - find and verify superblock...
- block cache size set to 735288 entries
Phase 2 - using internal log
- zero log...
zero_log: head block 1920 tail block 1920
- scan filesystem freespace and inode maps...
agi_freecount 54, counted 55 in ag 7
sb_ifree 947, counted 948
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
bad . entry in directory inode 5691013154, was 5691013170: correcting
bad . entry in directory inode 5691013156, was 5691013172: correcting
bad . entry in directory inode 5691013157, was 5691013173: correcting
bad . entry in directory inode 5691013163, was 5691013179: correcting
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26 (Danny: OOM occurred here with -m 3200)
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
Phase 5 - rebuild AG headers and trees...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
entry "SavedXML" in dir inode 2992927241 inconsistent with .. value (4324257659) in ino 5691013156
will clear entry "SavedXML"
rebuilding directory inode 2992927241
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
entry "Out" in dir inode 4324257659 inconsistent with .. value (2992927241) in ino 5691013172
will clear entry "Out"
rebuilding directory inode 4324257659
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
entry "tocs_file" in dir inode 5691012138 inconsistent with .. value (3520464676) in ino 5691013154
will clear entry "tocs_file"
entry "trees.log" in dir inode 5691012138 inconsistent with .. value (3791956240) in ino 5691013155
will clear entry "trees.log"
rebuilding directory inode 5691012138
entry "filelist.xml" in directory inode 5691012139 not consistent with .. value (1909707067) in inode 5691013157,
junking entry
fixing i8count in inode 5691012139
entry "image001.jpg" in directory inode 5691012140 not consistent with .. value (2450176033) in inode 5691013163,
junking entry
fixing i8count in inode 5691012140
entry "OCR" in dir inode 5691013154 inconsistent with .. value (5691013170) in ino 1909707065
will clear entry "OCR"
entry "Tmp" in dir inode 5691013154 inconsistent with .. value (5691013170) in ino 2179087403
will clear entry "Tmp"
entry "images" in dir inode 5691013154 inconsistent with .. value (5691013170) in ino 2450176007
will clear entry "images"
rebuilding directory inode 5691013154
entry "286_Kellman_Hoffer_Master.pdf_files" in dir inode 5691013156 inconsistent with .. value (5691013172) in ino 834535727
will clear entry "286_Kellman_Hoffer_Master.pdf_files"
rebuilding directory inode 5691013156
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- traversal finished ...
- moving disconnected inodes to lost+found ...
disconnected dir inode 834535727, moving to lost+found
disconnected dir inode 1909707065, moving to lost+found
disconnected dir inode 2179087403, moving to lost+found
disconnected dir inode 2450176007, moving to lost+found
disconnected dir inode 5691013154, moving to lost+found
disconnected dir inode 5691013155, moving to lost+found
disconnected dir inode 5691013156, moving to lost+found
disconnected dir inode 5691013157, moving to lost+found
disconnected dir inode 5691013163, moving to lost+found
disconnected dir inode 5691013172, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 81777983 nlinks from 2 to 12
resetting inode 1909210410 nlinks from 1 to 2
resetting inode 1909707067 nlinks from 3 to 2
resetting inode 2450176033 nlinks from 18 to 17
resetting inode 2992927241 nlinks from 13 to 12
resetting inode 3520464676 nlinks from 13 to 12
resetting inode 3791956240 nlinks from 13 to 12
resetting inode 4324257659 nlinks from 13 to 12
resetting inode 5691013154 nlinks from 5 to 2
resetting inode 5691013156 nlinks from 3 to 2
XFS_REPAIR Summary Tue Mar 31 03:11:00 2015
Phase Start End Duration
Phase 1: 03/31 02:28:04 03/31 02:28:05 1 second
Phase 2: 03/31 02:28:05 03/31 02:28:42 37 seconds
Phase 3: 03/31 02:28:42 03/31 02:48:29 19 minutes, 47 seconds
Phase 4: 03/31 02:48:29 03/31 02:55:40 7 minutes, 11 seconds
Phase 5: 03/31 02:55:40 03/31 02:55:43 3 seconds
Phase 6: 03/31 02:55:43 03/31 03:10:57 15 minutes, 14 seconds
Phase 7: 03/31 03:10:57 03/31 03:10:57
Total run time: 42 minutes, 53 seconds
done
Tue Mar 31 03:11:01 PDT 2015
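
As a quick sanity check on the timings above, here is a minimal Python sketch that sums the per-phase durations from the summary table and compares them with the reported total run time. It is not part of xfs_repair; the function names (duration_seconds, phase_durations) are made up for illustration, and the parsing assumes the summary layout shown in this log.

```python
import re

# Summary lines copied from the xfs_repair output above.
SUMMARY = """\
Phase 1: 03/31 02:28:04 03/31 02:28:05 1 second
Phase 2: 03/31 02:28:05 03/31 02:28:42 37 seconds
Phase 3: 03/31 02:28:42 03/31 02:48:29 19 minutes, 47 seconds
Phase 4: 03/31 02:48:29 03/31 02:55:40 7 minutes, 11 seconds
Phase 5: 03/31 02:55:40 03/31 02:55:43 3 seconds
Phase 6: 03/31 02:55:43 03/31 03:10:57 15 minutes, 14 seconds
Phase 7: 03/31 03:10:57 03/31 03:10:57
"""

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def duration_seconds(text):
    """Convert e.g. '19 minutes, 47 seconds' to seconds; empty text gives 0."""
    total = 0
    for count, unit in re.findall(r"(\d+) (second|minute|hour)s?", text):
        total += int(count) * UNIT_SECONDS[unit]
    return total


def phase_durations(summary):
    """Map phase number -> duration in seconds, read from the Duration column."""
    result = {}
    for line in summary.splitlines():
        # Four whitespace-separated fields (start date/time, end date/time),
        # then an optional duration such as '7 minutes, 11 seconds'.
        m = re.match(r"Phase (\d+): \S+ \S+ \S+ \S+ ?(.*)$", line)
        if m:
            result[int(m.group(1))] = duration_seconds(m.group(2))
    return result


if __name__ == "__main__":
    per_phase = phase_durations(SUMMARY)
    total = sum(per_phase.values())
    print(f"{total // 60} minutes, {total % 60} seconds")
```

The per-phase durations add up to the "Total run time" line in the log, with phases 3 and 6 accounting for most of the run.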
_______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs