I have tried to run xfs_repair, but it has been stuck on the following line for hours:

freeblk count 1 != flcount 1292712394 in ag 15

Any idea why? Below is a more comprehensive excerpt:

Tanker:~# xfs_repair -v -L /dev/md0
Phase 1 - find and verify superblock...
        - block cache size set to 166416 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 8 tail block 8
        - scan filesystem freespace and inode maps...
bad magic # 0xd8800000 in inobt block 0/424
expected level 1 got 55424 in inobt block 0/424
bad on-disk superblock 1 - bad magic number
primary/secondary superblock 1 conflict - AG superblock geometry info conflicts with filesystem geometry
bad magic # 0x4a6e005b for agf 1
bad version # -802481226 for agf 1
bad sequence # -1117236046 for agf 1
bad length 746456958 for agf 1, should be 15261872
flfirst -217756885 in agf 1 too large (max = 1024)
fllast -1858828195 in agf 1 too large (max = 1024)
bad magic # 0x2d98eeb7 for agi 1
bad version # 1487820178 for agi 1
bad sequence # -986498475 for agi 1
bad length # -640410626 for agi 1, should be 15261872
reset bad sb for ag 1
reset bad agf for ag 1
reset bad agi for ag 1
bad agbno 1166984667 in agfl, agno 1
freeblk count 1 != flcount 851499664 in ag 1
bad agbno 3740209675 for btbno root, agno 1
bad agbno 3584687323 for btbcnt root, agno 1
bad agbno 2431368798 for inobt root, agno 1
bad magic # 0x43f5a1a0 in btbno block 2/3274820
expected level 0 got 11696 in btbno block 2/3274820
bad magic # 0x67e3e2b in btcnt block 2/7285316
expected level 0 got 38919 in btcnt block 2/7285316
bad magic # 0x7bd8ba0c in inobt block 2/2893023
expected level 0 got 58199 in inobt block 2/2893023
dubious inode btree block header 2/2893023
badly aligned inode rec (starting inode = 1893464917)
bad starting inode # (1893464917 (0x2 0x70dbfb55)) in ino rec, skipping rec
// snip
badly aligned inode rec (starting inode = 2958692015)
bad starting inode # (2958692015 (0x2 0xb05a0eaf)) in ino rec, skipping rec
bad magic # 0x30317762 in inobt block 3/1386762
expected level 0 got 4096 in inobt block 3/1386762
dubious inode btree block header 3/1386762
bad on-disk superblock 4 - bad magic number
primary/secondary superblock 4 conflict - AG superblock geometry info conflicts with filesystem geometry
bad magic # 0xb8273064 for agf 4
bad version # 608255319 for agf 4
bad sequence # -1133901349 for agf 4
bad length -1010053909 for agf 4, should be 15261872
flfirst -1120948486 in agf 4 too large (max = 1024)
fllast -249192277 in agf 4 too large (max = 1024)
bad magic # 0x3e1aa5d1 for agi 4
bad version # 78211236 for agi 4
bad sequence # -95059905 for agi 4
bad length # -2068843361 for agi 4, should be 15261872
reset bad sb for ag 4
reset bad agf for ag 4
reset bad agi for ag 4
bad agbno 84248051 in agfl, agno 4
freeblk count 1 != flcount -1877584494 in ag 4
bad agbno 3295638635 for btbno root, agno 4
bad agbno 3824887919 for btbcnt root, agno 4
bad agbno 2584592551 for inobt root, agno 4
bad magic # 0xcb8a0838 in btbno block 5/15107725
expected level 1 got 62865 in btbno block 5/15107725
bad magic # 0x720a4ca5 in btcnt block 5/5781135
expected level 1 got 12940 in btcnt block 5/5781135
bad magic # 0x14c90900 in btbno block 6/8534085
expected level 0 got 8351 in btbno block 6/8534085
bad magic # 0 in btcnt block 6/8534087
bad magic # 0xb2fb97d4 in inobt block 6/8537277
expected level 0 got 28305 in inobt block 6/8537277
dubious inode btree block header 6/8537277
badly aligned inode rec (starting inode = 3916413448)
bad starting inode # (3916413448 (0x6 0xa96fba08)) in ino rec, skipping rec
// snip
badly aligned inode rec (starting inode = 3991836463)
bad starting inode # (3991836463 (0x6 0xcdee972f)) in ino rec, skipping rec
bad magic # 0xc1c0e0db in btbno block 7/14299066
expected level 0 got 56022 in btbno block 7/14299066
bad magic # 0 in btcnt block 7/131534
bad on-disk superblock 8 - bad magic number
primary/secondary superblock 8 conflict - AG superblock geometry info conflicts with filesystem geometry
bad magic # 0x0 for agf 8
bad version # 0 for agf 8
bad sequence # 0 for agf 8
bad length 0 for agf 8, should be 15261872
flfirst 1235838528 in agf 8 too large (max = 1024)
fllast -1624582223 in agf 8 too large (max = 1024)
bad magic # 0xd27ee25b for agi 8
bad version # 682133911 for agi 8
bad sequence # 868833830 for agi 8
bad length # -309198643 for agi 8, should be 15261872
reset bad sb for ag 8
reset bad agf for ag 8
reset bad agi for ag 8
bad agbno 288703486 in agfl, agno 8
freeblk count 1 != flcount -2096152000 in ag 8
bad agbno 0 for btbno root, agno 8
bad agbno 0 for btbcnt root, agno 8
bad agbno 3529020399 for inobt root, agno 8
bad magic # 0 in btbno block 9/13230958
expected level 1 got 0 in btbno block 9/13230958
bad magic # 0x9545b5da in inobt block 10/6063213
expected level 0 got 9464 in inobt block 10/6063213
dubious inode btree block header 10/6063213
badly aligned inode rec (starting inode = 2899097214)
bad starting inode # (2899097214 (0xa 0x8cccb67e)) in ino rec, skipping rec
// snip
bad starting inode # (3943235510 (0xa 0xcb08ffb6)) in ino rec, skipping rec
badly aligned inode rec (starting inode = 2842887821)
ir_freecount/free mismatch, inode chunk 10/158533261, freecount -564015419 nfree 24
badly aligned inode rec (starting inode = 2769797306)
bad starting inode # (2769797306 (0xa 0x2517c0ba)) in ino rec, skipping rec
badly aligned inode rec (starting inode = 2753872950)
ir_freecount/free mismatch, inode chunk 10/69518390, freecount 305136831 nfree 42
// snip
bad starting inode # (3912402532 (0xa 0x69328664)) in ino rec, skipping rec
bad on-disk superblock 11 - bad magic number
primary/secondary superblock 11 conflict - AG superblock geometry info conflicts with filesystem geometry
bad magic # 0xa558fbb3 for agf 11
bad version # -147377049 for agf 11
bad sequence # -762960785 for agf 11
bad length 1134727438 for agf 11, should be 15261872
flfirst 1200930101 in agf 11 too large (max = 1024)
fllast 582547970 in agf 11 too large (max = 1024)
bad magic # 0xfdb94fd6 for agi 11
bad version # 1570507841 for agi 11
bad sequence # -22538393 for agi 11
bad length # -454538688 for agi 11, should be 15261872
reset bad sb for ag 11
reset bad agf for ag 11
reset bad agi for ag 11
bad agbno 493291493 in agfl, agno 11
freeblk count 1 != flcount 1190954728 in ag 11
bad agbno 1979771594 for btbno root, agno 11
bad agbno 3493454650 for btbcnt root, agno 11
bad agbno 1378244542 for inobt root, agno 11
bad magic # 0x1010101 in btbno block 12/4811500
expected level 0 got 257 in btbno block 12/4811500
bad magic # 0 in btbno block 12/4811502
bno freespace btree block claimed (state 2), agno 12, bno 4811502, suspect 0
bad magic # 0x242591f3 in btcnt block 12/11519022
expected level 0 got 1773 in btcnt block 12/11519022
block (13,2522619) multiply claimed by bno space tree, state - 1
// snip
block (13,13871227) multiply claimed by bno space tree, state - 1
bad magic # 0x1bdd0e58 in btcnt block 13/1753645
expected level 0 got 37606 in btcnt block 13/1753645
bad magic # 0x41425443 in btbno block 14/12342653
block (14,90288) multiply claimed by bno space tree, state - 1
// snip
block (14,106654) multiply claimed by bno space tree, state - 1
bad magic # 0xc68e7699 in btcnt block 14/53183
expected level 0 got 37689 in btcnt block 14/53183
bad magic # 0 in btcnt block 14/1372
bad on-disk superblock 15 - bad magic number
primary/secondary superblock 15 conflict - AG superblock geometry info conflicts with filesystem geometry
bad magic # 0x494e81ff for agf 15
bad version # 33685504 for agf 15
bad sequence # 0 for agf 15
bad length 0 for agf 15, should be 15261872
flfirst 1237397194 in agf 15 too large (max = 1024)
fllast 211534305 in agf 15 too large (max = 1024)
bad magic # 0x494e81ff for agi 15
bad version # 33685504 for agi 15
bad sequence # 0 for agi 15
bad length # 0 for agi 15, should be 15261872
reset bad sb for ag 15
reset bad agf for ag 15
reset bad agi for ag 15
bad agbno 1229865471 in agfl, agno 15
freeblk count 1 != flcount 1292712394 in ag 15

-----Original Message-----
From: linux-raid-owner@xxxxxxxxxxxxxxx [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Philippe PIOLAT
Sent: Tuesday, January 4, 2011 12:36
To: 'NeilBrown'
Cc: linux-raid@xxxxxxxxxxxxxxx
Subject: RE: fd partitions gone from 2 discs, md happy with it and reconstructs... bye bye datas

Thanks a lot for your help. I think I have sunk deeper in the meantime...
I was able to recover the /dev/sdg1 and /dev/sdh1 entries using partprobe. I then zeroed the superblocks on all disks, recreated the array with --assume-clean and started the resync.
Then I received your answer and, as you advised, searched through older logs and discovered... that I had made the mistake of adding sdg and sdh to the array at the last upgrade, and not sdg1 and sdh1 as I believed I did!...
So I quickly stopped the sync (it had only been running for a few minutes), killed the sdg1 and sdh1 partitions, zeroed the superblocks and assembled again with --assume-clean. As of now, it's syncing again...
It's probably hopeless now, isn't it?.... :-(

Phil.

-----Original Message-----
From: linux-raid-owner@xxxxxxxxxxxxxxx [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of NeilBrown
Sent: Tuesday, January 4, 2011 12:04
To: Philippe PIOLAT
Cc: linux-raid@xxxxxxxxxxxxxxx
Subject: Re: fd partitions gone from 2 discs, md happy with it and reconstructs... bye bye datas

On Tue, 4 Jan 2011 10:11:10 +0100 "Philippe PIOLAT" <piolat@xxxxxxxxxxxx> wrote:

> Hey gurus, need some help badly with this one.
> I run a server with a 6Tb md raid5 volume built over 7*1Tb disks.
> I've had to shut down the server lately and when it went back up, 2
> out of the 7 disks used for the raid volume had lost their config:

I should say up front that I suspect you have lost your data. However, there is enough here that doesn't make sense that I cannot be certain of anything.

>
> dmesg :
> [ 10.184167]  sda: sda1 sda2 sda3 // System disk
> [ 10.202072]  sdb: sdb1
> [ 10.210073]  sdc: sdc1
> [ 10.222073]  sdd: sdd1
> [ 10.229330]  sde: sde1
> [ 10.239449]  sdf: sdf1
> [ 11.099896]  sdg: unknown partition table
> [ 11.255641]  sdh: unknown partition table

If sdg and sdh had a partition table before, but don't now, then at least the first block of each of those devices has been corrupted. In that case we must assume that an unknown number of blocks at the start of those drives has been corrupted. If so, you could already have lost critical data at this point, and nothing you could have done would have helped.
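For what it's worth, a quick way to see what is sitting in those first blocks now is simply to dump them (the device names are the ones from this thread; only dd and hexdump are assumed):

  # dump the first 512 bytes (the MBR area) of each suspect disk
  dd if=/dev/sdg bs=512 count=1 2>/dev/null | hexdump -C
  dd if=/dev/sdh bs=512 count=1 2>/dev/null | hexdump -C

A valid MBR ends with the 55 aa signature at offset 0x1fe; if that is missing, the partition table really has been overwritten, and comparing the dump against one of the healthy disks (sdb, say) gives a feel for how much of the start of the disk differs.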
>
> All 7 disks have same geometry and were configured alike :
>
> dmesg :
> Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0x1e7481a5
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1               1      121601   976760001   fd  Linux raid autodetect

So the partition started 16065 sectors from the start of the device. This is not a multiple of 64K, which is good.
If a partition starts at a multiple of 64K from the start of the device and extends to the end of the device, then the md metadata on the partition could look like it was on the disk as well. When mdadm sees a situation like this it will complain, but that cannot have been happening to you.
So when the partition table was destroyed, mdadm should not have been able to see the metadata that belonged to the partition.

>
> All 7 disks (sdb1, sdc1, sdd1, sde1, sdf1, sdg1, sdh1) were used in a
> md raid5 xfs volume.
> When booting, md, which was (obviously) out of sync, kicked in and
> automatically started rebuilding over the 7 disks, including the two
> "faulty" ones; xfs tried to do some shenanigans as well:
>
> dmesg :
> [ 19.566941] md: md0 stopped.
> [ 19.817038] md: bind<sdc1>
> [ 19.817339] md: bind<sdd1>
> [ 19.817465] md: bind<sde1>
> [ 19.817739] md: bind<sdf1>
> [ 19.817917] md: bind<sdh>
> [ 19.818079] md: bind<sdg>
> [ 19.818198] md: bind<sdb1>
> [ 19.818248] md: md0: raid array is not clean -- starting background reconstruction
> [ 19.825259] raid5: device sdb1 operational as raid disk 0
> [ 19.825261] raid5: device sdg operational as raid disk 6
> [ 19.825262] raid5: device sdh operational as raid disk 5
> [ 19.825264] raid5: device sdf1 operational as raid disk 4
> [ 19.825265] raid5: device sde1 operational as raid disk 3
> [ 19.825267] raid5: device sdd1 operational as raid disk 2
> [ 19.825268] raid5: device sdc1 operational as raid disk 1
> [ 19.825665] raid5: allocated 7334kB for md0
> [ 19.825667] raid5: raid level 5 set md0 active with 7 out of 7 devices, algorithm 2

... however it is clear that mdadm (and md) saw metadata at the end of the device which exactly matched the metadata on the other devices in the array.
This is very hard to explain. I can only think of three explanations, none of which seem particularly likely:

1/ The partition table on sdg and sdh actually placed the first partition at a multiple of 64K, unlike all the other devices in the array.

2/ Someone copied the superblock from the end of sdg1 to the end of sdg, and likewise from sdh1 to sdh. Given that the first block of both devices was changed too, a command like

      dd if=/dev/sdg1 of=/dev/sdg

   would have done it. But that seems extremely unlikely.

3/ The array previously consisted of 5 partitions and 2 whole devices. I have certainly seen this happen before, usually by accident. But if this were the case, your data should all be intact. Yet it isn't.
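To make the 64K check concrete, here is a rough sketch using the numbers quoted above (16065 is the start sector taken from the fdisk output; for another layout the start sector would be read from "fdisk -lu"):

  # does the partition's start offset fall on a 64 KiB boundary?
  START_SECTOR=16065                        # start of sdb1 in 512-byte sectors, per the fdisk output
  echo $(( START_SECTOR * 512 % 65536 ))    # prints 33280 here, i.e. not 64K-aligned; 0 would mean aligned

If the vanished partitions on sdg and sdh had started at an offset where this prints 0, explanation 1 above would become plausible.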
> [ 19.825669] RAID5 conf printout:
> [ 19.825670] --- rd:7 wd:7
> [ 19.825671] disk 0, o:1, dev:sdb1
> [ 19.825672] disk 1, o:1, dev:sdc1
> [ 19.825673] disk 2, o:1, dev:sdd1
> [ 19.825675] disk 3, o:1, dev:sde1
> [ 19.825676] disk 4, o:1, dev:sdf1
> [ 19.825677] disk 5, o:1, dev:sdh
> [ 19.825679] disk 6, o:1, dev:sdg
> [ 19.899787] PM: Starting manual resume from disk
> [ 28.663228] Filesystem "md0": Disabling barriers, not supported by the underlying device
> [ 28.663228] XFS mounting filesystem md0
> [ 28.884433] md: resync of RAID array md0
> [ 28.884433] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
> [ 28.884433] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
> [ 28.884433] md: using 128k window, over a total of 976759936 blocks.

This resync is why I think your data could well be lost. If the metadata did somehow get relocated, but the data didn't, then this will have updated all of the blocks that were thought to be parity blocks. All of those on sdg and sdh would almost certainly have been data blocks, and that data would now be gone.
But there are still some big 'if's in there.

> [ 29.025980] Starting XFS recovery on filesystem: md0 (logdev: internal)
> [ 32.680486] XFS: xlog_recover_process_data: bad clientid
> [ 32.680495] XFS: log mount/recovery failed: error 5
> [ 32.682773] XFS: log mount failed
>
> I ran fdisk and flagged sdg1 and sdh1 as fd.

If, however, the md metadata had not been moved, and the array was previously made of 5 partitions and two devices, then this action would have corrupted some data early in the array, possibly making it impossible to recover the xfs filesystem (not that it looked like it was particularly recoverable anyway).

> I tried to reassemble the array but it didn't work: no matter what was
> in mdadm.conf, it still uses sdg and sdh instead of sdg1 and sdh1.

This seems to confirm that the metadata that we thought was on sdg1 and sdh1 wasn't. Running "mdadm --examine /dev/sdg1", for example, would confirm that.

> I checked in /dev and I see no sdg1 and sdh1, which explains why
> it won't use them.

  mdadm -S /dev/md0
  blockdev --rereadpt /dev/sdg /dev/sdh

should fix that.

> I just don't know why those partitions are gone from /dev and how to
> re-add those...
>
> blkid :
> /dev/sda1: LABEL="boot" UUID="519790ae-32fe-4c15-a7f6-f1bea8139409" TYPE="ext2"
> /dev/sda2: TYPE="swap"
> /dev/sda3: LABEL="root" UUID="91390d23-ed31-4af0-917e-e599457f6155" TYPE="ext3"
> /dev/sdb1: UUID="2802e68a-dd11-c519-e8af-0d8f4ed72889" TYPE="mdraid"
> /dev/sdc1: UUID="2802e68a-dd11-c519-e8af-0d8f4ed72889" TYPE="mdraid"
> /dev/sdd1: UUID="2802e68a-dd11-c519-e8af-0d8f4ed72889" TYPE="mdraid"
> /dev/sde1: UUID="2802e68a-dd11-c519-e8af-0d8f4ed72889" TYPE="mdraid"
> /dev/sdf1: UUID="2802e68a-dd11-c519-e8af-0d8f4ed72889" TYPE="mdraid"
> /dev/sdg: UUID="2802e68a-dd11-c519-e8af-0d8f4ed72889" TYPE="mdraid"
> /dev/sdh: UUID="2802e68a-dd11-c519-e8af-0d8f4ed72889" TYPE="mdraid"
>
> fdisk -l :
> Disk /dev/sda: 40.0 GB, 40020664320 bytes
> 255 heads, 63 sectors/track, 4865 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0x8c878c87
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sda1   *           1          12       96358+  83  Linux
> /dev/sda2              13         134      979965   82  Linux swap / Solaris
> /dev/sda3             135        4865    38001757+  83  Linux
>
> Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0x1e7481a5
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1               1      121601   976760001   fd  Linux raid autodetect
>
> Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0xc9bdc1e9
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdc1               1      121601   976760001   fd  Linux raid autodetect
>
> Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0xcc356c30
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdd1               1      121601   976760001   fd  Linux raid autodetect
>
> Disk /dev/sde: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0xe87f7a3d
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sde1               1      121601   976760001   fd  Linux raid autodetect
>
> Disk /dev/sdf: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0xb17a2d22
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdf1               1      121601   976760001   fd  Linux raid autodetect
>
> Disk /dev/sdg: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0x8f3bce61
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdg1               1      121601   976760001   fd  Linux raid autodetect
>
> Disk /dev/sdh: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0xa98062ce
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdh1               1      121601   976760001   fd  Linux raid autodetect
>
> I really don't know what happened nor how to recover from this mess.
> Needless to say, the 5TB or so of data sitting on those disks is
> very valuable to me...
>
> Any idea, anyone?
> Did anybody ever experience a similar situation, or know how to
> recover from it?
>
> Can someone help me? I'm really desperate... :x

I would see if your /var/log files go back to the last reboot of this system and see if they show how the array was assembled then. If they do, then collect any message about md or raid from that time until now.
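For example, something like the following would pull those messages out (the exact log file names depend on the distribution; kern.log and messages are just the usual candidates):

  grep -iE 'md[0-9]*:|raid' /var/log/kern.log* /var/log/messages* | less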
That might give some hints as to what happened, but I don't hold a lot of hope that it will allow your data to be recovered.

NeilBrown