On 04/10/2017 12:23 PM, Avi Kivity wrote:
Today my kernel complained that in-memory metadata was corrupt and
asked me to run xfs_repair. But xfs_repair doesn't like the
superblock and isn't able to find a secondary superblock.
This is the latest Fedora 25 kernel on a new Intel NVMe drive (it worked
for a few weeks without issue).
Is there anything I can do to recover the data?
Initial error:
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata
CRC error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl
block 0x2cb68e13
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount
and run xfs_repair
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64
bytes of corrupted metadata buffer:
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75400: 23 40
8f 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed #@.([P:..T.1....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75410: 62 87
57 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09 b.WQ..1..,.Fl...
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75420: ae 7a
ea b3 91 49 7e d3 99 a4 25 49 11 c5 8b be .z...I~...%I....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75430: e4 2e
14 d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5 ......_.f.gr....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata
I/O error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1):
xfs_do_force_shutdown(0x8) called from line 236 of file
fs/xfs/libxfs/xfs_defer.c. Return address = 0xffffffffc05bdbc6
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1):
Corruption of in-memory data detected. Shutting down filesystem
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please
umount the filesystem and rectify the problem(s)
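As a rough sanity check on the log above, the reported block number can be turned into a byte offset on the device, and the error code into a name. This is only a sketch: it assumes the "block" in these XFS messages is a disk address in 512-byte units, and the helper name daddr_to_byte_offset is made up for illustration. The device size is the namespace capacity that smartctl reports for this drive.

```python
import errno

# Assumption: XFS prints disk addresses in 512-byte basic blocks.
SECTOR_SIZE = 512

# Namespace capacity in bytes, taken from the smartctl output for this drive.
DEVICE_BYTES = 512_110_190_592


def daddr_to_byte_offset(daddr: int, sector_size: int = SECTOR_SIZE) -> int:
    """Hypothetical helper: convert a sector-based disk address to a byte offset."""
    return daddr * sector_size


agfl_daddr = 0x2CB68E13  # "xfs_agfl block 0x2cb68e13" from the log
offset = daddr_to_byte_offset(agfl_daddr)

print(f"AGFL sector {agfl_daddr} is at byte offset {offset}")
print(f"inside the device: {offset < DEVICE_BYTES}")

# Errno names are platform-specific; on Linux, 74 is EBADMSG, which XFS
# reuses to mean "bad CRC".
print(f"error 74 -> {errno.errorcode.get(74, 'unknown on this platform')}")
```

The offset lands well inside the device, so the failing read is of a real on-disk location and not an out-of-range address.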
After restart:
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Mounting
V5 Filesystem
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Starting
recovery (logdev: internal)
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata
CRC error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl
block 0x2cb68e13
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount
and run xfs_repair
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64
bytes of corrupted metadata buffer:
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a00: 23 40
8f 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed #@.([P:..T.1....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a10: 62 87
57 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09 b.WQ..1..,.Fl...
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a20: ae 7a
ea b3 91 49 7e d3 99 a4 25 49 11 c5 8b be .z...I~...%I....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a30: e4 2e
14 d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5 ......_.f.gr....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata
I/O error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Internal
error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c. Caller
xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: CPU: 3 PID: 1063 Comm:
mount Not tainted 4.10.8-200.fc25.x86_64 #1
Apr 10 11:47:58 avi.cloudius-systems.com kernel: Hardware
name: /DH77EB, BIOS EBH7710H.86A.0099.2013.0125.1400
01/25/2013
Apr 10 11:47:58 avi.cloudius-systems.com kernel: Call Trace:
Apr 10 11:47:58 avi.cloudius-systems.com kernel: dump_stack+0x63/0x86
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xfs_error_report+0x3c/0x40 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ?
xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xfs_trans_cancel+0xb6/0xe0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xlog_recover_process_efi+0x2c/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xlog_recover_process_intents.isra.42+0x122/0x160 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ?
xfs_reinit_percpu_counters+0x46/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xlog_recover_finish+0x23/0xb0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xfs_log_mount_finish+0x29/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_mountfs+0x6ce/0x930
[xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
xfs_fs_fill_super+0x3ee/0x570 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: mount_bdev+0x178/0x1b0
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ?
xfs_test_remount_options.isra.14+0x60/0x60 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_fs_mount+0x15/0x20
[xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: mount_fs+0x38/0x150
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ? __alloc_percpu+0x15/0x20
Apr 10 11:47:58 avi.cloudius-systems.com kernel: vfs_kern_mount+0x67/0x130
Apr 10 11:47:58 avi.cloudius-systems.com kernel: do_mount+0x1dd/0xc50
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ?
_copy_from_user+0x4e/0x80
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ? memdup_user+0x4f/0x70
Apr 10 11:47:58 avi.cloudius-systems.com kernel: SyS_mount+0x83/0xd0
Apr 10 11:47:58 avi.cloudius-systems.com kernel: do_syscall_64+0x67/0x180
Apr 10 11:47:58 avi.cloudius-systems.com kernel:
entry_SYSCALL64_slow_path+0x25/0x25
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RIP: 0033:0x7f5cb9a626fa
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RSP:
002b:00007ffeffa2c928 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RAX: ffffffffffffffda
RBX: 000055b59fd6f030 RCX: 00007f5cb9a626fa
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RDX: 000055b59fd6f210
RSI: 000055b59fd6f250 RDI: 000055b59fd6f230
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RBP: 0000000000000000
R08: 0000000000000000 R09: 0000000000000012
Apr 10 11:47:58 avi.cloudius-systems.com kernel: R10: 00000000c0ed0000
R11: 0000000000000246 R12: 000055b59fd6f230
Apr 10 11:47:58 avi.cloudius-systems.com kernel: R13: 000055b59fd6f210
R14: 0000000000000000 R15: 00000000ffffffff
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1):
xfs_do_force_shutdown(0x8) called from line 984 of file
fs/xfs/xfs_trans.c. Return address = 0xffffffffc056324f
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1):
Corruption of in-memory data detected. Shutting down filesystem
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please
umount the filesystem and rectify the problem(s)
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Failed
to recover intents
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): log
mount finish failed
SMART output (note the error at the end; there were no kernel I/O errors
from the block layer):
$ sudo smartctl -a /dev/nvme0n1
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.10.8-200.fc25.x86_64]
(local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: INTEL SSDPEKKW512G7
Serial Number: BTPY6313086D512F
Firmware Version: PSF100C
PCI Vendor/Subsystem ID: 0x8086
IEEE OUI Identifier: 0x5cd2e4
Controller ID: 1
Number of Namespaces: 1
Namespace 1 Size/Capacity: 512,110,190,592 [512 GB]
Namespace 1 Formatted LBA Size: 512
Local Time is: Mon Apr 10 12:36:41 2017 IDT
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0006): Format Frmw_DL
Optional NVM Commands (0x001e): Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size: 32 Pages
Warning Comp. Temp. Threshold: 70 Celsius
Critical Comp. Temp. Threshold: 80 Celsius
Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        5       5
 1 +     4.60W       -        -    1  1  1  1       30      30
 2 +     3.80W       -        -    2  2  2  2       30      30
 3 -   0.0700W       -        -    3  3  3  3    10000     300
 4 -   0.0050W       -        -    4  4  4  4     2000   10000
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning: 0x00
Temperature: 27 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 8,854,487 [4.53 TB]
Data Units Written: 5,652,445 [2.89 TB]
Host Read Commands: 446,901,662
Host Write Commands: 35,627,742
Controller Busy Time: 633
Power Cycles: 24
Power On Hours: 987
Unsafe Shutdowns: 16
Media and Data Integrity Errors: 1
Error Information Log Entries: 1
Warning Comp. Temperature Time: 11
Critical Comp. Temperature Time: 0
Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0          1     1  0x0000  0x0286      -            0     1     -
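
For reading the SMART counters above, here is a small sketch that reproduces the terabyte figures smartctl prints in brackets and puts the unsafe-shutdown count in proportion to the power cycles. It relies on the NVMe convention that one "data unit" is 1000 sectors of 512 bytes; the function name data_units_to_tb is made up for illustration.

```python
# One NVMe SMART "data unit" is 1000 * 512 bytes.
DATA_UNIT_BYTES = 1000 * 512


def data_units_to_tb(units: int) -> float:
    """Hypothetical helper: convert NVMe data units to decimal terabytes."""
    return units * DATA_UNIT_BYTES / 1e12


# Values copied from the smartctl output above.
read_tb = data_units_to_tb(8_854_487)
written_tb = data_units_to_tb(5_652_445)
print(f"{read_tb:.2f} TB read, {written_tb:.2f} TB written")

# Unsafe shutdowns relative to power cycles, also from the output above.
unsafe_shutdowns, power_cycles = 16, 24
ratio = unsafe_shutdowns / power_cycles
print(f"{unsafe_shutdowns} of {power_cycles} power cycles were unsafe ({ratio:.0%})")
```

The computed values match the bracketed figures in the output (4.53 TB and 2.89 TB), which suggests the counters themselves are being reported consistently.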