Hello Phil,

thank you BTW for your continued assistance. Here goes:

On Sat, Sep 10, 2022 at 11:18 AM Phil Turmel <philip@xxxxxxxxxx> wrote:

> Yes. Same kernels are pretty repeatable for device order on bootup as
> long as all are present. Anything missing will shift the letter
> assignments.

We need to keep this in mind, though the scsi target -> letter
assignments described in the boot log seem to indicate that we're
clear, as discussed. This is relevant since I have re-created the
array.

> Okay, that should have saved you. Except, I think it still writes all
> the meta-data. With v1.2, that would sparsely trash up to 1/4 gig at
> the beginning of each device.

I dug into the docs and the wiki and ran some experiments on another
machine. Apparently, what 1.2 does with my kernel and my mdadm is use
sectors 9 to 80 of each device. Thus, it borked 72 512-byte sectors ->
36 kB -> 9 filesystem blocks (4 kB each) per device, sparsely as you
say.

This is 'fine' even with a 128kB chunk. The first chunk doesn't really
matter: yes, fsck detects that the block group descriptors got nuked,
but the superblock before them is fine (indeed, tune2fs and dumpe2fs
work 'as expected'), so fsck goes to a backup and is happy, even
declaring the fs clean.

Therefore, out of the 12 'affected' areas, one doesn't matter for
practical purposes and we have to wonder about the others. Arguably,
one of those should also be managed by parity, but I have no idea how
that will work out - it may actually be very important at the time of
any future resync.

Now, these are all in the first chunk of each device; together those
chunks form the first 1408 kB of the filesystem (11 data chunks of
128 kB; remember the original creation is *old*), since I believe
mdraid preserves sequence and the chunks are therefore in order.
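To make the arithmetic above easy to check, here is a minimal Python
sketch. It is only a sanity check under stated assumptions, not anything
taken from mdadm: 512-byte sectors, 4 kB filesystem blocks, the observed
sector range counted 1-based and inclusive, the old 0.90 layout with
data starting at device offset 0, and the data chunks of the first
stripe sitting on the first 11 members in order (the real parity
position depends on the RAID layout). All names are made up for
illustration.

```python
SECTOR = 512                 # bytes per sector (assumed)
FS_BLOCK = 4096              # filesystem block size (assumed: 36 kB -> 9 blocks)
CHUNK = 128 * 1024           # md chunk size of the original array
DATA_CHUNKS = 11             # 12 members, one chunk per stripe holds parity (assumed)
FIRST_SECTOR = 9             # first overwritten sector, 1-based (assumed numbering)
LAST_SECTOR = 80             # last overwritten sector, inclusive


def damaged_bytes_per_device():
    """Number of bytes the v1.2 metadata write touched on each member."""
    return (LAST_SECTOR - FIRST_SECTOR + 1) * SECTOR


def first_stripe_kib():
    """Size of the filesystem region covered by the first stripe, in KiB."""
    return DATA_CHUNKS * CHUNK // 1024


def damaged_fs_blocks(data_index):
    """Filesystem blocks (first, last) hit inside data chunk `data_index` (0-based).

    Assumes chunk `data_index` of the first stripe maps to filesystem byte
    offset data_index * CHUNK, i.e. the chunks are laid out in device order.
    """
    start = (FIRST_SECTOR - 1) * SECTOR   # byte offset within the device
    end = LAST_SECTOR * SECTOR            # exclusive end
    base = data_index * CHUNK             # filesystem offset of this chunk
    return (base + start) // FS_BLOCK, (base + end - 1) // FS_BLOCK


if __name__ == "__main__":
    print("bytes per device:", damaged_bytes_per_device())
    print("first stripe (KiB):", first_stripe_kib())
    for i in range(DATA_CHUNKS):
        print("chunk", i, "-> fs blocks", damaged_fs_blocks(i))
```

Under those assumptions each device loses 36864 bytes, and the damaged
filesystem blocks in the first stripe run from 1-9 on the first chunk to
321-329 on the eleventh, all within the first 1408 kB (352 blocks).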
We know the following from dumpe2fs:

---
Group 0: (Blocks 0-32767) csum 0x45ff [ITABLE_ZEROED]
  Primary superblock at 0, Group descriptors at 1-2096
  Block bitmap at 2260 (+2260), csum 0x824f8d47
  Inode bitmap at 2261 (+2261), csum 0xdadef5ad
  Inode table at 2262-2773 (+2262)
  0 free blocks, 8179 free inodes, 2 directories, 8179 unused inodes
---

So the first 2097 blocks are the superblock plus the group descriptors,
all of which have backups - this is *way* more than the 1408 kB
(352 blocks). Therefore, with restored BGDs (fsck -b 32768, say) we
should be... fine?

Now, if OTOH I do an -nf, all sorts of weird stuff happens, but I have
to wonder whether that's because the BGDs are not happy. I am tempted
to run an overlay *for the fsck*, what do you think?

> Well, yes. But doesn't matter for assembly attempts, which always go by
> the meta-data. Device order only ever matters for --create when
> recreating.

Sure, but keep in mind, my --create commands nuked the original 0.90
metadata as well, so we need to be sure that the order is correct or
we'll have a real jumble. Now, the cables have not been moved and the
boot logs confirm that the scsi targets correspond, so we should have
the order correct, and the parameters are correct from the previous
logs. Therefore, we 'should' have the same dataspa

> If you consistently used -o or --assume-clean, then everything beyond
> ~3G should be untouched, if you can get the order right. Have fsck try
> backup superblocks way out.

fsck grabs a backup 'magically' and seems to be happy, unless I -nf it;
then ... all sorts of bad stuff happens.

> Please use lsdrv to capture names versus serial numbers. Re-run it
> before any --create operation to ensure the current names really do
> match the expected serial numbers. Keep track of ordering information
> by serial number. Note that lsdrv will reliably line up PHYs on SAS
> controllers, so that can be trusted, too.

Thing is... I can't find lsdrv. As in: there is no lsdrv binary,
apparently, in Debian stable or in Debian testing.
Where do I look for it?

> Superblocks other than 0.9x and 1.0 place a bad block log and a written
> block bitmap between the superblock and the data area. I'm not sure if
> any of the remaining space is wiped. These would be written regardless of
> -o or --assume-clean. Those flags "protect" the *data area* of the
> array, not the array's own metadata.

Yes - this is the damage I'm talking about above. From the logs, the
'area' is 4096 sectors, of which 4016 remain 'unused'. That leaves 80
sectors, with the first 8 not being touched (and the proof is that the
superblock is 'happy', though interestingly this should not be the case
because the group 0 superblock is offset by 1024 bytes -> the last 1024
bytes of the superblock should be borked too). From this, my math
above.

> > From, I think, the second --create of /dev/123, before I added the
> > bitmap=none. This should, however, not have written anything with -o
> > and --assume-clean, correct?
> False assumption. As described above.

Two different things: what I meant was that even with that bitmap
message, the only thing that would have been written is the metadata.
The Linux RAID documentation states repeatedly that with -o no
resyncing or parity reconstruction would be performed. Yes, agreed, the
1.2 metadata got written, but it's the only thing that got written
since the array was stopped by the error, if I am reading the docs
correctly?

> Okay. To date, you've only done create with -o or --assume-clean?
>
> If so, it is likely your 0.90 superblocks are still present at the ends
> of the disks.

Problem is, if you look at my previous email, as I mentioned above, I
have ALSO done --create with --metadata=0.90, which overwrote the
original superblocks. HOWEVER, I do have the logs of the original
parameters, and I have at least one drive - the old sdc - which was
spit out before this whole thing. That drive becomes relevant to
confirm that the parameter log is correct (multiple things seem to
coincide, so I think we're OK there).
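On the superblock question above, a small sketch of the overlap check
may help. It assumes what I believe is the standard ext2/3/4 layout
(primary superblock 1024 bytes long, starting 1024 bytes into the
filesystem), 512-byte sectors, and the old 0.90 layout where the
filesystem starts at device offset 0. The function names are
hypothetical.

```python
SECTOR = 512            # bytes per sector (assumed)
EXT_SB_OFFSET = 1024    # primary ext superblock starts here (assumed standard layout)
EXT_SB_SIZE = 1024      # and is this many bytes long (assumed standard layout)


def overlaps(a_start, a_end, b_start, b_end):
    """True if the half-open byte ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def superblock_hit(first_sector, last_sector, one_based=True):
    """Would a write to sectors first..last (inclusive) touch the ext superblock?

    `one_based` selects whether the sector numbers count from 1 or from 0,
    since the numbering convention of the observed range is not certain.
    """
    shift = 1 if one_based else 0
    write_start = (first_sector - shift) * SECTOR
    write_end = (last_sector - shift + 1) * SECTOR
    return overlaps(write_start, write_end,
                    EXT_SB_OFFSET, EXT_SB_OFFSET + EXT_SB_SIZE)


if __name__ == "__main__":
    print("1-based 9..80 hits superblock:", superblock_hit(9, 80))
    print("0-based 9..80 hits superblock:", superblock_hit(9, 80, one_based=False))
```

With these numbers the superblock ends at byte 2048, so a write that
starts at sector 9 (byte 4096 if counted from 1, byte 4608 if counted
from 0) would not reach it either way, which would be consistent with
tune2fs and dumpe2fs still working.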
Given all the above, however, if we get the parameters to match, we
should get a filesystem that corresponds to the state before the event
everywhere after the first 1408kB - and those first 1408kB don't matter
insofar as we have redundant backups in ext4 for at least the first
2060 blocks >> 1408 kB.

The thing that I do NOT understand is this: if that is the case, fsck
with -b <high> should render a FS without any errors. So why am I
getting inode metadata checksum errors? This is why I had originally
posted in linux-ext4 ...

Thanks,
L