On Fri, Sep 01, 2017 at 08:48:03AM +0200, Ingard - wrote:
> On Thu, Aug 31, 2017 at 12:20 PM, Brian Foster <bfoster@xxxxxxxxxx> wrote:
> > On Thu, Aug 31, 2017 at 09:27:52AM +0200, Ingard - wrote:
> >> On Wed, Aug 30, 2017 at 4:58 PM, Brian Foster <bfoster@xxxxxxxxxx> wrote:
> >> > On Mon, Aug 21, 2017 at 10:24:32PM +0200, Ingard - wrote:
> >> >> On Mon, Aug 21, 2017 at 5:51 PM, Brian Foster <bfoster@xxxxxxxxxx> wrote:
> >> >> > On Mon, Aug 21, 2017 at 02:08:43PM +0200, Ingard - wrote:
> >> >> >> On Fri, Aug 18, 2017 at 2:17 PM, Brian Foster <bfoster@xxxxxxxxxx> wrote:
> >> >> >> > On Fri, Aug 18, 2017 at 07:02:24AM -0500, Bill O'Donnell wrote:
> >> >> >> >> On Fri, Aug 18, 2017 at 01:56:31PM +0200, Ingard - wrote:
> >> >> >> >> > After a server crash we've encountered a corrupt xfs filesystem. When
> >> >> >> >> > trying to mount said filesystem normally, the system hangs.
> >> >> >> >> > This was initially on an Ubuntu trusty server with a 3.13 kernel and
> >> >> >> >> > xfsprogs 3.1.9.
> >> >> >> >> >
> >> >> >> >> > We've installed a newer kernel (4.4.0-92) and compiled xfsprogs
> >> >> >> >> > v4.12.0 from source. We're still not able to mount the filesystem (and
> >> >> >> >> > replay the log) normally.
> >> >> >> >> > We are able to mount it -o ro,norecovery, but we're reluctant to do
> >> >> >> >> > xfs_repair -L without trying everything we can first. The filesystem
> >> >> >> >> > is browsable apart from a few paths which give the error "Structure
> >> >> >> >> > needs cleaning".
> >> >> >> >> >
> >> >> >> >> > Does anyone have any advice as to how we might recover/repair the
> >> >> >> >> > corrupt log so we can replay it? Or is xfs_repair -L the only way
> >> >> >> >> > forward?
> >> >> >> >>
> >> >> >> >> Can you try xfs_repair -n (only scans the fs and reports what repairs
> >> >> >> >> would be made)?
> >> >> >> >>
> >> >> >> >
> >> >> >> > An xfs_metadump of the fs might be useful as well. Then we can see if we
> >> >> >> > can reproduce the mount hang on latest kernels and, if so, potentially
> >> >> >> > try and root cause it.
> >> >> >> >
> >> >> >> > Brian
> >> >> >>
> >> >> >> Here is a link for the metadump:
> >> >> >> https://www.jottacloud.com/p/ingardme/95ec2e45ba80431d962345981d38bdff
> >> >> >
> >> >> > This points to a 29GB image file, apparently uncompressed..? Could you
> >> >> > upload a compressed file? Thanks.
> >> >>
> >> >> Hi. Sorry about that. Didn't realize the output would be compressible.
> >> >> Here is a link to the compressed tgz (6G):
> >> >> https://www.jottacloud.com/p/ingardme/cac6939649e14b98b928647f5222a2ae
> >> >>
> >> >
> >> > I finally played around with this image a bit. Note that mount does not
> >> > hang on latest kernels. Instead, log recovery emits a torn write message
> >> > due to a bad crc at the head of the log and then ultimately fails due to
> >> > a bad crc at the tail of the log. I ran a couple of experiments to skip the
> >> > bad crc records and/or to completely ignore all bad crc's, and both still
> >> > either fail to mount (due to other corruption) or continue to show
> >> > corruption in the recovered fs.
> >> >
> >> > It's not clear to me what would have caused this corruption or log
> >> > state. Have you encountered any corruption before? If not, is this kind
> >> > of crash or unclean shutdown of the server an uncommon event?
> >>
> >> We failed to notice the log messages about the corrupt fs at first. After a
> >> few days of these messages the filesystem got shut down due to
> >> excessive? corruption.
> >> At that point we tried to reboot normally, but ended up having to
> >> do a hard reset of the server.
> >> It is not clear to us either why the corruption happened in the first
> >> place. The underlying raid has been in an optimal state the whole
> >> time.
> >>
> >
> > Ok, so corruption was the first problem. If the filesystem shut down with
> > something other than a log I/O error, chances are the log was flushed at
> > that time. It is strange that log records end up corrupted, though it's not
> > terribly out of the ordinary for the mount to ultimately fail if
> > recovery stumbled over existing on-disk corruption, for instance.
> > An xfs_repair was probably a foregone conclusion given the corruption
> > started on disk, anyway.
>
> Out of curiosity, how long did the xfs_mdrestore command run? I'm
> pushing 20-ish hours now and noticed the following in kern.log:
> 2017-09-01T08:47:23.414139+02:00 dn-238 kernel: [1278740.983304] XFS:
> xfs_mdrestore(5176) possible memory allocation deadlock size 37136 in
> kmem_alloc (mode:0x2400240)
>

Heh. It certainly wasn't quick since it had to restore ~30GB or so of
metadata, but it didn't take that long. If I had to guess, I'd say it
restored within an hour.

It seems like you're running into the in-core extent list problem, which
causes pain for highly sparse or fragmented files because we store the
entire extent list in memory. An fiemap of the restored image I have lying
around shows over 1.5m extents. :/

You may need a box with more RAM (I had 32GB) or otherwise find another
large enough block device to use for the metadump restore. :/ If you had to
bypass that step, you could at least run 'xfs_repair -n' on the original fs
to see whether repair runs to completion in your environment.

Brian
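(For reference, a rough sketch of the restore-and-check workflow suggested in
this thread. The metadump filename and the /scratch path are only
illustrative, /dev/sdd1 is the device from the original report quoted below,
and the scratch host needs enough free space and memory for the ~30GB of
metadata:)

  # Dry-run repair directly on the (unmounted) original device; nothing is
  # modified, it only reports what would be done.
  xfs_repair -n /dev/sdd1

  # Restore the metadump to a sparse image file and rehearse the repair there
  # first. As noted above, a heavily fragmented sparse image can itself need
  # a lot of memory during the restore.
  xfs_mdrestore -g sdd1.metadump /scratch/sdd1.img
  xfs_repair -f -n /scratch/sdd1.img
  xfs_repair -f -L /scratch/sdd1.img

  # Only once the rehearsal results look acceptable, zero the log and repair
  # the real filesystem.
  xfs_repair -L /dev/sdd1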
> ingard
>
> >
> > Brian
> >
> >> >
> >> > That aside, I think the best course of action is to run 'xfs_repair -L'
> >> > on the fs. I ran a v4.12 version against the metadump image and it
> >> > successfully repaired the fs. I've attached the repair output for
> >> > reference, but I would recommend first restoring your metadump to a
> >> > temporary location, attempting to repair that and examining the results
> >> > before repairing the original fs. Note that the metadump will not have
> >> > any file content, but it will show which files might be cleared, moved
> >> > to lost+found, etc.
> >>
> >> Ok. Thanks for looking into it. We'll proceed with the suggested
> >> course of action.
> >>
> >> ingard
> >>
> >> > Brian
> >> >
> >> >> >
> >> >> > Brian
> >> >> >
> >> >> >> And the repair -n output:
> >> >> >> https://www.jottacloud.com/p/ingardme/0205c6ca6f7e495ebcda5f255b96f63d
> >> >> >>
> >> >> >> kind regards
> >> >> >> ingard
> >> >> >>
> >> >> >> >
> >> >> >> >> Thanks-
> >> >> >> >> Bill
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > Excerpt from kern.log:
> >> >> >> >> > 2017-08-17T13:40:41.122121+02:00 dn-238 kernel: [  294.300347] XFS
> >> >> >> >> > (sdd1): Mounting V4 filesystem in no-recovery mode. Filesystem will be
> >> >> >> >> > inconsistent.
> >> >> >> >> >
> >> >> >> >> > 2017-08-17T17:04:54.794194+02:00 dn-238 kernel: [12548.400260] XFS
> >> >> >> >> > (sdd1): Metadata corruption detected at xfs_inode_buf_verify+0x6f/0xd0
> >> >> >> >> > [xfs], xfs_inode block 0x81c9c210
> >> >> >> >> > 2017-08-17T17:04:54.794216+02:00 dn-238 kernel: [12548.400342] XFS
> >> >> >> >> > (sdd1): Unmount and run xfs_repair
> >> >> >> >> > 2017-08-17T17:04:54.794218+02:00 dn-238 kernel: [12548.400374] XFS
> >> >> >> >> > (sdd1): First 64 bytes of corrupted metadata buffer:
> >> >> >> >> > 2017-08-17T17:04:54.794220+02:00 dn-238 kernel: [12548.400418]
> >> >> >> >> > ffff880171fff000: 3f 1a 33 54 5b 55 85 0b 7c f5 c6 d5 cf 51 47 41
> >> >> >> >> > ?.3T[U..|....QGA
> >> >> >> >> > 2017-08-17T17:04:54.794222+02:00 dn-238 kernel: [12548.400473]
> >> >> >> >> > ffff880171fff010: 97 ba ba 03 5c e4 02 7a e6 bc fb 5d f1 72 db c1
> >> >> >> >> > ....\..z...].r..
> >> >> >> >> > 2017-08-17T17:04:54.794223+02:00 dn-238 kernel: [12548.400527]
> >> >> >> >> > ffff880171fff020: c8 ad 3a 76 c7 e4 20 92 88 a2 35 0c 1f 36 cf b5
> >> >> >> >> > ..:v.. ...5..6..
> >> >> >> >> > 2017-08-17T17:04:54.794226+02:00 dn-238 kernel: [12548.400581]
> >> >> >> >> > ffff880171fff030: 8a bc 42 75 86 50 a0 a2 be 2c 2d 99 96 2d e1 ee
> >> >> >> >> > ..Bu.P...,-..-..
> >> >> >> >> >
> >> >> >> >> > kind regards
> >> >> >> >> > ingard
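(Also for reference, the corrupted inode buffer reported in the excerpt above
can be inspected read-only with xfs_db, roughly along these lines; the daddr
is the block number from the log message, and the exact command syntax and
output may vary with the xfsprogs version:)

  # Read-only xfs_db session: seek to the reported block and print it
  # interpreted as inode metadata, to get an idea of how it is damaged.
  xfs_db -r -c "daddr 0x81c9c210" -c "type inode" -c "print" /dev/sdd1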