Re: filesystem dead, xfs_repair won't help

On 04/10/2017 12:23 PM, Avi Kivity wrote:
Today my kernel complained that in-memory metadata is corrupt and
asked me to run xfs_repair.  But xfs_repair rejects the primary
superblock and cannot find a secondary superblock.

Latest Fedora 25 kernel, new Intel NVMe drive (worked for a few weeks
without issue).

Anything I can do to recover the data?


Initial error:

Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata CRC error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl block 0x2cb68e13
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount and run xfs_repair
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64 bytes of corrupted metadata buffer:
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75400: 23 40 8f 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed #@.([P:..T.1....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75410: 62 87 57 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09 b.WQ..1..,.Fl...
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75420: ae 7a ea b3 91 49 7e d3 99 a4 25 49 11 c5 8b be .z...I~...%I....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: ffff9004a5b75430: e4 2e 14 d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5 ......_.f.gr....
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata I/O error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): xfs_do_force_shutdown(0x8) called from line 236 of file fs/xfs/libxfs/xfs_defer.c. Return address = 0xffffffffc05bdbc6
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Corruption of in-memory data detected. Shutting down filesystem
Apr 10 11:41:20 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please umount the filesystem and rectify the problem(s)


After restart:

Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Mounting V5 Filesystem
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Starting recovery (logdev: internal)
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Metadata CRC error detected at xfs_agfl_read_verify+0xcd/0x100 [xfs], xfs_agfl block 0x2cb68e13
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Unmount and run xfs_repair
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): First 64 bytes of corrupted metadata buffer:
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a00: 23 40 8f 28 5b 50 3a b4 f8 54 1e 31 97 f4 fe ed #@.([P:..T.1....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a10: 62 87 57 51 ee 9d 31 02 ec 2c 10 46 6c 93 db 09 b.WQ..1..,.Fl...
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a20: ae 7a ea b3 91 49 7e d3 99 a4 25 49 11 c5 8b be .z...I~...%I....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ffff9450761d4a30: e4 2e 14 d4 8a f8 5f 98 66 d8 67 72 ec c9 1a d5 ......_.f.gr....
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): metadata I/O error: block 0x2cb68e13 ("xfs_trans_read_buf_map") error 74 numblks 1
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Internal error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c. Caller xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: CPU: 3 PID: 1063 Comm: mount Not tainted 4.10.8-200.fc25.x86_64 #1
Apr 10 11:47:58 avi.cloudius-systems.com kernel: Hardware name: /DH77EB, BIOS EBH7710H.86A.0099.2013.0125.1400 01/25/2013
Apr 10 11:47:58 avi.cloudius-systems.com kernel: Call Trace:
Apr 10 11:47:58 avi.cloudius-systems.com kernel: dump_stack+0x63/0x86
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_error_report+0x3c/0x40 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ? xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_trans_cancel+0xb6/0xe0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_efi_recover+0x18e/0x1c0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xlog_recover_process_efi+0x2c/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xlog_recover_process_intents.isra.42+0x122/0x160 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ? xfs_reinit_percpu_counters+0x46/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xlog_recover_finish+0x23/0xb0 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_log_mount_finish+0x29/0x50 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_mountfs+0x6ce/0x930 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_fs_fill_super+0x3ee/0x570 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: mount_bdev+0x178/0x1b0
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ? xfs_test_remount_options.isra.14+0x60/0x60 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: xfs_fs_mount+0x15/0x20 [xfs]
Apr 10 11:47:58 avi.cloudius-systems.com kernel: mount_fs+0x38/0x150
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? __alloc_percpu+0x15/0x20
Apr 10 11:47:58 avi.cloudius-systems.com kernel: vfs_kern_mount+0x67/0x130
Apr 10 11:47:58 avi.cloudius-systems.com kernel: do_mount+0x1dd/0xc50
Apr 10 11:47:58 avi.cloudius-systems.com kernel: ? _copy_from_user+0x4e/0x80
Apr 10 11:47:58 avi.cloudius-systems.com kernel:  ? memdup_user+0x4f/0x70
Apr 10 11:47:58 avi.cloudius-systems.com kernel: SyS_mount+0x83/0xd0
Apr 10 11:47:58 avi.cloudius-systems.com kernel: do_syscall_64+0x67/0x180
Apr 10 11:47:58 avi.cloudius-systems.com kernel: entry_SYSCALL64_slow_path+0x25/0x25
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RIP: 0033:0x7f5cb9a626fa
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RSP: 002b:00007ffeffa2c928 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RAX: ffffffffffffffda RBX: 000055b59fd6f030 RCX: 00007f5cb9a626fa
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RDX: 000055b59fd6f210 RSI: 000055b59fd6f250 RDI: 000055b59fd6f230
Apr 10 11:47:58 avi.cloudius-systems.com kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000012
Apr 10 11:47:58 avi.cloudius-systems.com kernel: R10: 00000000c0ed0000 R11: 0000000000000246 R12: 000055b59fd6f230
Apr 10 11:47:58 avi.cloudius-systems.com kernel: R13: 000055b59fd6f210 R14: 0000000000000000 R15: 00000000ffffffff
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): xfs_do_force_shutdown(0x8) called from line 984 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffc056324f
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Corruption of in-memory data detected. Shutting down filesystem
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Please umount the filesystem and rectify the problem(s)
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): Failed to recover intents
Apr 10 11:47:58 avi.cloudius-systems.com kernel: XFS (nvme0n1): log mount finish failed
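
For anyone who wants to look at the failing AGFL sector on the raw device, here is a minimal sketch that turns the reported block number into a byte offset. It assumes the kernel prints the buffer address in 512-byte basic blocks; that unit is an assumption on my part, not something the log states, and the helper name daddr_to_bytes is made up for illustration.

```python
# Assumption: the "block 0x..." value in the XFS messages is a disk address
# counted in 512-byte basic blocks (not filesystem blocks).
BASIC_BLOCK_BYTES = 512

def daddr_to_bytes(daddr: int) -> int:
    """Convert a basic-block disk address to a byte offset on the device."""
    return daddr * BASIC_BLOCK_BYTES

agfl_block = 0x2cb68e13            # block number from the log above
namespace_bytes = 512_110_190_592  # Namespace 1 Size/Capacity from smartctl

offset = daddr_to_bytes(agfl_block)
print(f"byte offset: {offset}")
print(f"inside the namespace: {offset < namespace_bytes}")
```

Under that assumption the offset falls well inside the 512 GB namespace, so the address itself looks plausible and only the contents read from it are bad.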



SMART output (note the error at the end; there were no kernel I/O errors from the block layer):

$ sudo smartctl -a /dev/nvme0n1
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.10.8-200.fc25.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       INTEL SSDPEKKW512G7
Serial Number:                      BTPY6313086D512F
Firmware Version:                   PSF100C
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Mon Apr 10 12:36:41 2017 IDT
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0006):   Format Frmw_DL
Optional NVM Commands (0x001e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        5       5
 1 +     4.60W       -        -    1  1  1  1       30      30
 2 +     3.80W       -        -    2  2  2  2       30      30
 3 -   0.0700W       -        -    3  3  3  3    10000     300
 4 -   0.0050W       -        -    4  4  4  4     2000   10000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning:                   0x00
Temperature:                        27 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    8,854,487 [4.53 TB]
Data Units Written:                 5,652,445 [2.89 TB]
Host Read Commands:                 446,901,662
Host Write Commands:                35,627,742
Controller Busy Time:               633
Power Cycles:                       24
Power On Hours:                     987
Unsafe Shutdowns:                   16
Media and Data Integrity Errors:    1
Error Information Log Entries:      1
Warning  Comp. Temperature Time:    11
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0          1     1  0x0000  0x0286      -            0     1     -
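
As a sanity check on the counters above, the bracketed TB figures can be reproduced from the raw data-unit counts. This is a rough sketch that assumes one NVMe data unit is 1000 sectors of 512 bytes and that smartctl reports decimal terabytes; the function name is hypothetical.

```python
# Assumption: one NVMe "data unit" is 1000 * 512 bytes, and TB means 10**12 bytes.
BYTES_PER_DATA_UNIT = 1000 * 512

def data_units_to_tb(units: int) -> float:
    """Convert an NVMe data-unit counter to decimal terabytes."""
    return units * BYTES_PER_DATA_UNIT / 1e12

read_tb = data_units_to_tb(8_854_487)     # Data Units Read
written_tb = data_units_to_tb(5_652_445)  # Data Units Written
print(f"read: {read_tb:.2f} TB, written: {written_tb:.2f} TB")
```

Both values round to the figures smartctl prints, so the drive has seen only a few TB of writes, consistent with "Percentage Used: 0%".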
