Re: XFS on RBD crash

On Sat, Dec 9, 2017 at 4:01 PM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:
> I experienced a crash today (in the sense of the filesystem going
> offline) of a 25TB XFS filesystem.  I searched the list and Google but
> found little specific information I could use, so I would very much
> appreciate any insight:
>
> System: Ubuntu 16.04, kernel 4.10.17-041017-generic
>
> Mount info:
>
> /dev/rbd0 on /srv/exports/sclun63 type xfs
> (rw,relatime,attr2,inode64,logbsize=256k,sunit=8192,swidth=8192,noquota)
>
> Failure trace:
>
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.769776] XFS (rbd1):
> Internal error XFS_WANT_CORRUPTED_GOTO at line 3505 of file
> /home/kernel/COD/linux/fs/xfs/libxfs/xfs_btree.c.  Caller
> xfs_free_ag_extent+0x352/0x740 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.769890] CPU: 20 PID:
> 724783 Comm: kworker/20:4 Tainted: G           OE
> 4.10.17-041017-generic #201705201051
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.769963] Hardware name:
> Dell Inc. Precision Rack 7910/01J90F, BIOS 1.1.4 11/04/2014
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770058] Workqueue:
> xfs-conv/rbd1 xfs_end_io [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770098] Call Trace:
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770136]  dump_stack+0x63/0x81
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770198]
> xfs_error_report+0x3c/0x40 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770257]  ?
> xfs_free_ag_extent+0x352/0x740 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770317]
> xfs_btree_insert+0x16d/0x1c0 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770375]
> xfs_free_ag_extent+0x352/0x740 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770433]
> xfs_free_extent+0xa8/0x140 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770501]
> xfs_trans_free_extent+0x43/0x110 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770568]
> xfs_extent_free_finish_item+0x26/0x40 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770630]
> xfs_defer_finish+0x151/0x430 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770694]
> xfs_iomap_write_unwritten+0xd3/0x2e0 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770737]  ?
> propagate_entity_cfs_rq.isra.66+0x271/0xa30
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770802]
> xfs_end_io+0xa0/0xb0 [xfs]
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770840]
> process_one_work+0x1fc/0x4b0
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770877]  worker_thread+0x4b/0x500
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770914]  kthread+0x109/0x140
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770949]  ?
> process_one_work+0x4b0/0x4b0
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.770987]  ?
> kthread_create_on_node+0x60/0x60
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.771028]  ret_from_fork+0x2c/0x40
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.771093] XFS (rbd1):
> xfs_do_force_shutdown(0x8) called from line 236 of file
> /home/kernel/COD/linux/fs/xfs/libxfs/xfs_defer.c.  Return address =
> 0xffffffffc0f0d366
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.772151] XFS (rbd1):
> Corruption of in-memory data detected.  Shutting down filesystem
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.772219] XFS (rbd1):
> Please umount the filesystem and rectify the problem(s)
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.772292] Buffer I/O
> error on dev rbd1, logical block 5078005778, lost async page write
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.772362] Buffer I/O
> error on dev rbd1, logical block 5078005779, lost async page write
> Dec  9 14:05:45 roc01r-scd224 kernel: [3455811.777755] Buffer I/O
> error on dev rbd1, logical block 5078005780, lost async page write
>
> xfs_repair output (I had to use -L):
>
> root@roc01r-scd224:~# xfs_repair -L /dev/rbd0
> Phase 1 - find and verify superblock...
>         - reporting progress in intervals of 15 minutes
> Phase 2 - using internal log
>         - zero log...
> ALERT: The filesystem has valuable metadata changes in a log which is being
> destroyed because the -L option was used.
>         - scan filesystem freespace and inode maps...
> freeblk count 3 != flcount 4 in ag 47
> freeblk count 3 != flcount 4 in ag 45
> freeblk count 3 != flcount 4 in ag 46
> freeblk count 3 != flcount 4 in ag 53
> freeblk count 3 != flcount 4 in ag 49
> freeblk count 3 != flcount 4 in ag 52
> freeblk count 3 != flcount 4 in ag 56
> freeblk count 3 != flcount 4 in ag 51
> freeblk count 3 != flcount 4 in ag 48
> freeblk count 3 != flcount 4 in ag 57
> freeblk count 3 != flcount 4 in ag 54
> freeblk count 3 != flcount 4 in ag 50
> freeblk count 3 != flcount 4 in ag 58
> freeblk count 3 != flcount 4 in ag 55
> freeblk count 3 != flcount 4 in ag 59
> freeblk count 3 != flcount 4 in ag 61
> freeblk count 3 != flcount 4 in ag 62
> freeblk count 3 != flcount 4 in ag 63
> freeblk count 3 != flcount 4 in ag 60
> freeblk count 3 != flcount 4 in ag 70
> freeblk count 3 != flcount 4 in ag 65
> freeblk count 3 != flcount 4 in ag 66
> freeblk count 3 != flcount 4 in ag 71
> freeblk count 3 != flcount 4 in ag 68
> freeblk count 3 != flcount 4 in ag 69
> freeblk count 3 != flcount 4 in ag 64
> freeblk count 3 != flcount 4 in ag 73
> freeblk count 3 != flcount 4 in ag 67
> freeblk count 3 != flcount 4 in ag 78
> freeblk count 3 != flcount 4 in ag 76
> freeblk count 3 != flcount 4 in ag 72
> freeblk count 3 != flcount 4 in ag 80
> freeblk count 3 != flcount 4 in ag 83
> freeblk count 3 != flcount 4 in ag 84
> freeblk count 3 != flcount 4 in ag 75
> freeblk count 3 != flcount 4 in ag 74
> freeblk count 3 != flcount 4 in ag 85
> freeblk count 3 != flcount 4 in ag 79
> freeblk count 3 != flcount 4 in ag 81
> freeblk count 3 != flcount 4 in ag 77
> freeblk count 3 != flcount 4 in ag 82
> freeblk count 3 != flcount 4 in ag 88
> freeblk count 3 != flcount 4 in ag 91
> freeblk count 3 != flcount 4 in ag 86
> freeblk count 3 != flcount 4 in ag 87
> freeblk count 3 != flcount 4 in ag 93
> freeblk count 3 != flcount 4 in ag 92
> freeblk count 3 != flcount 4 in ag 90
> freeblk count 3 != flcount 4 in ag 94
> freeblk count 3 != flcount 4 in ag 89
> freeblk count 3 != flcount 4 in ag 97
> freeblk count 3 != flcount 4 in ag 95
> freeblk count 3 != flcount 4 in ag 96
> freeblk count 3 != flcount 4 in ag 98
> freeblk count 3 != flcount 4 in ag 100
> freeblk count 3 != flcount 4 in ag 99
> freeblk count 3 != flcount 4 in ag 103
> freeblk count 3 != flcount 4 in ag 101
> freeblk count 3 != flcount 4 in ag 104
> freeblk count 3 != flcount 4 in ag 102
> freeblk count 3 != flcount 4 in ag 105
> freeblk count 3 != flcount 4 in ag 110
> freeblk count 3 != flcount 4 in ag 106
> freeblk count 3 != flcount 4 in ag 113
> freeblk count 3 != flcount 4 in ag 115
> freeblk count 3 != flcount 4 in ag 111
> freeblk count 3 != flcount 4 in ag 109
> freeblk count 3 != flcount 4 in ag 107
> freeblk count 3 != flcount 4 in ag 116
> freeblk count 3 != flcount 4 in ag 114
> freeblk count 3 != flcount 4 in ag 108
> freeblk count 3 != flcount 4 in ag 117
> freeblk count 3 != flcount 4 in ag 124
> freeblk count 3 != flcount 4 in ag 119
> freeblk count 3 != flcount 4 in ag 118
> freeblk count 3 != flcount 4 in ag 120
> freeblk count 3 != flcount 4 in ag 121
> freeblk count 3 != flcount 4 in ag 122
> freeblk count 3 != flcount 4 in ag 126
> freeblk count 3 != flcount 4 in ag 125
> freeblk count 3 != flcount 4 in ag 127
> freeblk count 3 != flcount 4 in ag 131
> freeblk count 3 != flcount 4 in ag 123
> freeblk count 3 != flcount 4 in ag 130
> freeblk count 3 != flcount 4 in ag 133
> freeblk count 3 != flcount 4 in ag 128
> freeblk count 3 != flcount 4 in ag 137
> freeblk count 3 != flcount 4 in ag 136
> freeblk count 3 != flcount 4 in ag 138
> freeblk count 3 != flcount 4 in ag 135
> freeblk count 3 != flcount 4 in ag 132
> freeblk count 3 != flcount 4 in ag 139
> freeblk count 3 != flcount 4 in ag 134
> freeblk count 3 != flcount 4 in ag 129
> freeblk count 3 != flcount 4 in ag 145
> freeblk count 3 != flcount 4 in ag 147
> freeblk count 3 != flcount 4 in ag 144
> freeblk count 3 != flcount 4 in ag 140
> freeblk count 3 != flcount 4 in ag 149
> freeblk count 3 != flcount 4 in ag 143
> freeblk count 3 != flcount 4 in ag 142
> freeblk count 3 != flcount 4 in ag 141
> freeblk count 3 != flcount 4 in ag 152
> freeblk count 3 != flcount 4 in ag 148
> freeblk count 3 != flcount 4 in ag 151
> freeblk count 3 != flcount 4 in ag 146
> freeblk count 3 != flcount 4 in ag 154
> freeblk count 3 != flcount 4 in ag 150
> freeblk count 3 != flcount 4 in ag 155
> freeblk count 3 != flcount 4 in ag 157
> freeblk count 3 != flcount 4 in ag 156
> freeblk count 3 != flcount 4 in ag 160
> freeblk count 3 != flcount 4 in ag 158
> freeblk count 3 != flcount 4 in ag 153
> freeblk count 3 != flcount 4 in ag 159
> sb_ifree 667, counted 615
> sb_fdblocks 3930698117, counted 1111093367
>         - 15:06:10: scanning filesystem freespace - 193 of 193
> allocation groups done
>         - found root inode chunk
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - 15:06:10: scanning agi unlinked lists - 193 of 193
> allocation groups done
>         - process known inodes and perform inode discovery...
>         - agno = 0
>         - agno = 30
>         - agno = 60
>         - agno = 15
>         - agno = 75
>         - agno = 45
>         - agno = 150
>         - agno = 31
>         - agno = 105
>         - agno = 61
>         - agno = 46
>         - agno = 135
>         - agno = 76
>         - agno = 120
>         - agno = 151
>         - agno = 90
>         - agno = 180
>         - agno = 165
>         - agno = 91
>         - agno = 32
>         - agno = 62
>         - agno = 106
>         - agno = 121
>         - agno = 166
>         - agno = 136
>         - agno = 47
>         - agno = 92
>         - agno = 16
>         - agno = 1
>         - agno = 181
>         - agno = 33
>         - agno = 167
>         - agno = 93
>         - agno = 122
>         - agno = 63
>         - agno = 152
>         - agno = 77
>         - agno = 2
>         - agno = 168
>         - agno = 48
>         - agno = 34
>         - agno = 182
>         - agno = 123
>         - agno = 107
>         - agno = 78
>         - agno = 153
>         - agno = 17
>         - agno = 3
>         - agno = 137
>         - agno = 35
>         - agno = 64
>         - agno = 79
>         - agno = 169
>         - agno = 138
>         - agno = 183
>         - agno = 108
>         - agno = 94
>         - agno = 65
>         - agno = 124
>         - agno = 36
>         - agno = 154
>         - agno = 80
>         - agno = 49
>         - agno = 184
>         - agno = 109
>         - agno = 37
>         - agno = 95
>         - agno = 125
>         - agno = 170
>         - agno = 50
>         - agno = 66
>         - agno = 81
>         - agno = 110
>         - agno = 139
>         - agno = 155
>         - agno = 185
>         - agno = 96
>         - agno = 126
>         - agno = 38
>         - agno = 171
>         - agno = 127
>         - agno = 140
>         - agno = 97
>         - agno = 51
>         - agno = 128
>         - agno = 67
>         - agno = 111
>         - agno = 82
>         - agno = 156
>         - agno = 186
>         - agno = 141
>         - agno = 172
>         - agno = 98
>         - agno = 129
>         - agno = 52
>         - agno = 130
>         - agno = 39
>         - agno = 142
>         - agno = 173
>         - agno = 53
>         - agno = 99
>         - agno = 157
>         - agno = 68
>         - agno = 112
>         - agno = 40
>         - agno = 83
>         - agno = 131
>         - agno = 54
>         - agno = 187
>         - agno = 113
>         - agno = 69
>         - agno = 143
>         - agno = 174
>         - agno = 84
>         - agno = 158
>         - agno = 41
>         - agno = 100
>         - agno = 159
>         - agno = 175
>         - agno = 144
>         - agno = 188
>         - agno = 42
>         - agno = 85
>         - agno = 70
>         - agno = 132
>         - agno = 101
>         - agno = 114
>         - agno = 55
>         - agno = 176
>         - agno = 160
>         - agno = 145
>         - agno = 86
>         - agno = 71
>         - agno = 189
>         - agno = 102
>         - agno = 43
>         - agno = 115
>         - agno = 161
>         - agno = 177
>         - agno = 56
>         - agno = 133
>         - agno = 72
>         - agno = 103
>         - agno = 44
>         - agno = 87
>         - agno = 190
>         - agno = 134
>         - agno = 73
>         - agno = 162
>         - agno = 116
>         - agno = 57
>         - agno = 146
>         - agno = 74
>         - agno = 191
>         - agno = 104
>         - agno = 117
>         - agno = 147
>         - agno = 163
>         - agno = 88
>         - agno = 178
>         - agno = 58
>         - agno = 192
>         - agno = 118
>         - agno = 148
>         - agno = 89
>         - agno = 164
>         - agno = 59
>         - agno = 179
>         - agno = 119
>         - agno = 149
>         - agno = 4
>         - agno = 5
>         - agno = 18
>         - agno = 6
>         - agno = 7
>         - agno = 8
>         - agno = 9
>         - agno = 10
>         - agno = 11
>         - agno = 12
>         - agno = 13
>         - agno = 14
>         - agno = 19
>         - agno = 20
>         - agno = 21
>         - agno = 22
>         - agno = 23
>         - agno = 24
>         - agno = 25
>         - agno = 26
>         - agno = 27
>         - agno = 28
>         - agno = 29
>         - 15:10:19: process known inodes and inode discovery - 896 of
> 896 inodes done
>         - process newly discovered inodes...
>         - 15:10:19: process newly discovered inodes - 193 of 193
> allocation groups done
> Phase 4 - check for duplicate blocks...
>         - setting up duplicate extent list...
>         - 15:10:19: setting up duplicate extent list - 193 of 193
> allocation groups done
>         - check for inodes claiming duplicate blocks...
>         - agno = 1
>         - agno = 0
>         - agno = 2
>         - agno = 3
>         - agno = 9
>         - agno = 12
>         - agno = 6
>         - agno = 8
>         - agno = 4
>         - agno = 10
>         - agno = 7
>         - agno = 13
>         - agno = 31
>         - agno = 29
>         - agno = 35
>         - agno = 38
>         - agno = 40
>         - agno = 42
>         - agno = 43
>         - agno = 45
>         - agno = 18
>         - agno = 19
>         - agno = 21
>         - agno = 22
>         - agno = 23
>         - agno = 24
>         - agno = 26
>         - agno = 51
>         - agno = 52
>         - agno = 25
>         - agno = 53
>         - agno = 54
>         - agno = 55
>         - agno = 32
>         - agno = 14
>         - agno = 33
>         - agno = 61
>         - agno = 62
>         - agno = 30
>         - agno = 5
>         - agno = 65
>         - agno = 66
>         - agno = 37
>         - agno = 16
>         - agno = 39
>         - agno = 68
>         - agno = 69
>         - agno = 70
>         - agno = 17
>         - agno = 44
>         - agno = 73
>         - agno = 74
>         - agno = 75
>         - agno = 76
>         - agno = 77
>         - agno = 78
>         - agno = 79
>         - agno = 50
>         - agno = 27
>         - agno = 81
>         - agno = 28
>         - agno = 11
>         - agno = 34
>         - agno = 85
>         - agno = 86
>         - agno = 59
>         - agno = 87
>         - agno = 60
>         - agno = 63
>         - agno = 90
>         - agno = 91
>         - agno = 64
>         - agno = 93
>         - agno = 94
>         - agno = 95
>         - agno = 41
>         - agno = 71
>         - agno = 72
>         - agno = 36
>         - agno = 20
>         - agno = 46
>         - agno = 47
>         - agno = 102
>         - agno = 48
>         - agno = 104
>         - agno = 105
>         - agno = 106
>         - agno = 108
>         - agno = 84
>         - agno = 110
>         - agno = 56
>         - agno = 112
>         - agno = 113
>         - agno = 115
>         - agno = 116
>         - agno = 117
>         - agno = 118
>         - agno = 96
>         - agno = 97
>         - agno = 98
>         - agno = 99
>         - agno = 100
>         - agno = 101
>         - agno = 103
>         - agno = 49
>         - agno = 126
>         - agno = 127
>         - agno = 107
>         - agno = 82
>         - agno = 83
>         - agno = 131
>         - agno = 109
>         - agno = 133
>         - agno = 111
>         - agno = 134
>         - agno = 57
>         - agno = 58
>         - agno = 89
>         - agno = 92
>         - agno = 67
>         - agno = 142
>         - agno = 143
>         - agno = 119
>         - agno = 120
>         - agno = 145
>         - agno = 146
>         - agno = 123
>         - agno = 148
>         - agno = 149
>         - agno = 150
>         - agno = 151
>         - agno = 152
>         - agno = 129
>         - agno = 154
>         - agno = 130
>         - agno = 157
>         - agno = 135
>         - agno = 136
>         - agno = 88
>         - agno = 114
>         - agno = 137
>         - agno = 162
>         - agno = 164
>         - agno = 139
>         - agno = 140
>         - agno = 166
>         - agno = 141
>         - agno = 168
>         - agno = 122
>         - agno = 171
>         - agno = 124
>         - agno = 173
>         - agno = 174
>         - agno = 128
>         - agno = 176
>         - agno = 177
>         - agno = 155
>         - agno = 156
>         - agno = 132
>         - agno = 180
>         - agno = 158
>         - agno = 159
>         - agno = 160
>         - agno = 161
>         - agno = 186
>         - agno = 138
>         - agno = 188
>         - agno = 189
>         - agno = 15
>         - agno = 144
>         - agno = 191
>         - agno = 192
>         - agno = 170
>         - agno = 147
>         - agno = 172
>         - agno = 125
>         - agno = 80
>         - agno = 175
>         - agno = 153
>         - agno = 178
>         - agno = 179
>         - agno = 181
>         - agno = 182
>         - agno = 183
>         - agno = 184
>         - agno = 185
>         - agno = 163
>         - agno = 187
>         - agno = 165
>         - agno = 167
>         - agno = 190
>         - agno = 169
>         - agno = 121
>         - 15:10:22: check for inodes claiming duplicate blocks - 896
> of 896 inodes done
> Phase 5 - rebuild AG headers and trees...
>         - 15:10:22: rebuild AG headers and trees - 193 of 193
> allocation groups done
>         - reset superblock...
> Phase 6 - check inode connectivity...
>         - resetting contents of realtime bitmap and summary inodes
>         - traversing filesystem ...
>         - traversal finished ...
>         - moving disconnected inodes to lost+found ...
> Phase 7 - verify and correct link counts...
> Maximum metadata LSN (792:1294080) is ahead of log (1:64).
> Format log to cycle 795.
> done
> root@roc01r-scd224:~# mount -t xfs /dev/rbd0 /mnt
>
>
> There are no other errors in the logs, Ceph, or hardware.
>
>
> Thank you in advance!
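
Since the xfs_repair output above is long, here is a minimal Python sketch for condensing it: it collects the AG numbers that reported a "freeblk count != flcount" mismatch and the superblock counters that differed from what was counted. The function name `summarize_repair` and the `SAMPLE` excerpt are illustrative only and not part of any XFS tool; the patterns are written against the messages exactly as they appear in this report and may not match other xfsprogs versions.

```python
import re

# A few lines copied from the xfs_repair output above (quote markers kept).
SAMPLE = """\
> freeblk count 3 != flcount 4 in ag 47
> freeblk count 3 != flcount 4 in ag 45
> sb_ifree 667, counted 615
> sb_fdblocks 3930698117, counted 1111093367
"""

# Patterns assume the message wording seen in this report.
FREELIST_RE = re.compile(r"freeblk count (\d+) != flcount (\d+) in ag (\d+)")
COUNTER_RE = re.compile(r"(sb_\w+) (\d+), counted (\d+)")


def summarize_repair(text):
    """Return (sorted AG numbers with a freelist mismatch,
    {counter name: (superblock value, counted value)})."""
    ags = sorted(int(m.group(3)) for m in FREELIST_RE.finditer(text))
    counters = {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in COUNTER_RE.finditer(text)
    }
    return ags, counters


if __name__ == "__main__":
    ags, counters = summarize_repair(SAMPLE)
    print("AGs with freelist mismatch:", ags)
    for name, (on_disk, counted) in counters.items():
        print(f"{name}: superblock={on_disk} counted={counted} "
              f"diff={on_disk - counted}")
```

Feeding the full output to `summarize_repair` instead of `SAMPLE` gives the complete AG list in sorted order, which is easier to scan than the interleaved order printed by the parallel scan.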

I forgot to include the xfs_info output:

xfs_info /srv/exports/sclun63
meta-data=/dev/rbd0              isize=512    agcount=193, agsize=33553408 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0
data     =                       bsize=4096   blocks=6442450944, imaxpct=5
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
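
As a cross-check of the geometry, the following Python sketch converts the figures reported above into byte sizes. All constants are copied from the xfs_info and mount output in this thread; the only outside assumption is that the mount options express sunit in 512-byte units while xfs_info expresses it in filesystem blocks, which is how XFS documents them. The `geometry` helper is a hypothetical name used for illustration.

```python
# Values copied from the xfs_info output and mount options in this thread.
BSIZE = 4096           # data bsize: bytes per filesystem block
BLOCKS = 6442450944    # data blocks
AGCOUNT = 193          # number of allocation groups
AGSIZE = 33553408      # blocks per allocation group
SUNIT_BLKS = 1024      # sunit from xfs_info, in filesystem blocks
SUNIT_MOUNT = 8192     # sunit from the mount options, in 512-byte units

TIB = 1 << 40
MIB = 1 << 20


def geometry():
    """Derive byte sizes from the reported geometry."""
    fs_bytes = BLOCKS * BSIZE
    # Every AG except the last is AGSIZE blocks; the last holds the remainder.
    last_ag_blocks = BLOCKS - (AGCOUNT - 1) * AGSIZE
    return {
        "fs_tib": fs_bytes / TIB,
        "sunit_bytes_xfs_info": SUNIT_BLKS * BSIZE,
        "sunit_bytes_mount": SUNIT_MOUNT * 512,
        "last_ag_blocks": last_ag_blocks,
    }


if __name__ == "__main__":
    g = geometry()
    print(f"filesystem size: {g['fs_tib']} TiB")
    print(f"sunit (xfs_info): {g['sunit_bytes_xfs_info'] // MIB} MiB")
    print(f"sunit (mount):    {g['sunit_bytes_mount'] // MIB} MiB")
    print(f"last AG: {g['last_ag_blocks']} blocks")
```

With these inputs the data section works out to 24 TiB, and the two sunit figures agree at 4 MiB, so the mount options and xfs_info describe the same stripe geometry.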