Having attempted recovery with the journal tool and having had that fail, we are going to rebuild our metadata into a separate metadata pool.
Below is the procedure we plan to use. What I haven't figured out yet (likely lack of sleep) is how to swap the rebuilt metadata pool back into the original CephFS so we can keep using the default name, and then how to remove the secondary file system; an untested sketch of one idea follows the command list below.
# Enable a second file system and create the recovery metadata pool and file system
ceph fs flag set enable_multiple true --yes-i-really-mean-it
ceph osd pool create recovery 512 replicated replicated_ruleset
ceph fs new recovery-fs recovery cephfs-cold --allow-dangerous-metadata-overlay
cephfs-data-scan init --force-init --filesystem recovery-fs --alternate-pool recovery
ceph fs reset recovery-fs --yes-i-really-mean-it
# Reset the recovery file system's session, snap, and inode tables
cephfs-table-tool recovery-fs:all reset session
cephfs-table-tool recovery-fs:all reset snap
cephfs-table-tool recovery-fs:all reset inode
# Rebuild the metadata by scanning the data pool
# scan_extents (4 parallel workers)
cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 0 --worker_m 4 --filesystem cephfs cephfs-cold
cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 1 --worker_m 4 --filesystem cephfs cephfs-cold
cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 2 --worker_m 4 --filesystem cephfs cephfs-cold
cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 3 --worker_m 4 --filesystem cephfs cephfs-cold
# scan_inodes (4 parallel workers)
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 0 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 1 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 2 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 3 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold
cephfs-data-scan scan_links --filesystem recovery-fs
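
For the swap afterwards, the untested idea mentioned above is to tear both file systems down once the recovered metadata looks sane and recreate the default name on top of the recovery pool, roughly like this (the file system and pool names match those above; this is only a sketch and I would appreciate a sanity check before we run it):

# untested sketch: run only after failing/stopping all MDS daemons for both file systems
ceph fs rm recovery-fs --yes-i-really-mean-it
ceph fs rm cephfs --yes-i-really-mean-it
ceph fs new cephfs recovery cephfs-cold --allow-dangerous-metadata-overlay
ceph fs reset cephfs --yes-i-really-mean-it
ceph fs flag set enable_multiple false --yes-i-really-mean-it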
Thanks
Rhian
On Fri, Nov 2, 2018 at 9:47 PM Rhian Resnick <xantho@xxxxxxxxxxxx> wrote:
I was posting with my office account, but I think it is being blocked.

Our CephFS metadata pool grew from 1 GB to 1 TB in a matter of hours, and after exhausting all storage on its OSDs the file system now reports two damaged ranks. The cephfs-journal-tool crashes on any operation due to memory utilization. We tried a journal backup, which crashed (we then used rados cppool to back up our metadata pool). I then tried to run a dentry recovery, which also failed due to memory usage.

Any recommendations for the next step?
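
For reference, the commands behind those attempts were roughly the following (the backup file name, pool names, and rank selection here are illustrative, not our exact invocations):

# journal backup attempt (crashed for us)
cephfs-journal-tool journal export backup.bin
# fallback metadata backup via a raw pool copy (pool names are examples)
rados cppool cephfs_metadata cephfs_metadata_backup
# dentry recovery attempt (failed due to memory usage)
cephfs-journal-tool event recover_dentries summary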
Data from our config and status:

Combined logs (after marking things as repaired to see if that would rescue us):

Nov 1 10:07:02 ceph-p-mds2 ceph-mds: 2018-11-01 10:07:02.045499 7f68db7a3700 -1 mds.4.purge_queue operator(): Error -108 loading Journaler
Nov 1 10:07:02 ceph-p-mds2 ceph-mds: 2018-11-01 10:07:02.045499 7f68db7a3700 -1 mds.4.purge_queue operator(): Error -108 loading Journaler
Nov 1 10:26:40 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:40.968143 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 1 mds daemon damaged (MDS_DAMAGE)
Nov 1 10:26:40 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:40.968143 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 1 mds daemon damaged (MDS_DAMAGE)
Nov 1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.914934 7f6dacd69700 -1 mds.1.journaler.mdlog(ro) try_read_entry: decode error from _is_readable
Nov 1 10:26:47 ceph-storage2 ceph-mds: mds.1 10.141.255.202:6898/1492854021 1 : Error loading MDS rank 1: (22) Invalid argument
Nov 1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.914949 7f6dacd69700 0 mds.1.log _replay journaler got error -22, aborting
Nov 1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.914934 7f6dacd69700 -1 mds.1.journaler.mdlog(ro) try_read_entry: decode error from _is_readable
Nov 1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.915745 7f6dacd69700 -1 log_channel(cluster) log [ERR] : Error loading MDS rank 1: (22) Invalid argument
Nov 1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.915745 7f6dacd69700 -1 log_channel(cluster) log [ERR] : Error loading MDS rank 1: (22) Invalid argument
Nov 1 10:26:47 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:47.999432 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 2 mds daemons damaged (MDS_DAMAGE)
Nov 1 10:26:47 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:47.999432 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 2 mds daemons damaged (MDS_DAMAGE)
Nov 1 10:26:55 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:55.026231 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 1 mds daemon damaged (MDS_DAMAGE)
Nov 1 10:26:55 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:55.026231 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 1 mds daemon damaged (MDS_DAMAGE)

Ceph OSD Status (the missing and out OSDs are in a different pool from all data; these were the bad SSDs that caused the issue):

  cluster:
    id:     6a2e8f21-bca2-492b-8869-eecc995216cc
    health: HEALTH_ERR
            1 filesystem is degraded
            2 mds daemons damaged

  services:
    mon: 3 daemons, quorum ceph-p-mon2,ceph-p-mon1,ceph-p-mon3
    mgr: ceph-p-mon1(active), standbys: ceph-p-mon2
    mds: cephfs-3/5/5 up {0=ceph-storage3=up:resolve,2=ceph-p-mon3=up:resolve,4=ceph-p-mds1=up:resolve}, 3 up:standby, 2 damaged
    osd: 170 osds: 167 up, 158 in

  data:
    pools:   7 pools, 7520 pgs
    objects: 188.46M objects, 161TiB
    usage:   275TiB used, 283TiB / 558TiB avail
    pgs:     7511 active+clean
             9    active+clean+scrubbing+deep

  io:
    client: 0B/s rd, 17.2KiB/s wr, 0op/s rd, 1op/s wr

Ceph OSD Tree:

ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-10 0 root deefault
-9 5.53958 root ssds
-11 1.89296 host ceph-cache1
35 hdd 1.09109 osd.35 up 0 1.00000
181 hdd 0.26729 osd.181 up 0 1.00000
182 hdd 0.26729 osd.182 down 0 1.00000
183 hdd 0.26729 osd.183 down 0 1.00000
-12 1.75366 host ceph-cache2
46 hdd 1.09109 osd.46 up 0 1.00000
185 hdd 0.26729 osd.185 down 0 1.00000
186 hdd 0.12799 osd.186 up 0 1.00000
187 hdd 0.26729 osd.187 up 0 1.00000
-13 1.89296 host ceph-cache3
60 hdd 1.09109 osd.60 up 0 1.00000
189 hdd 0.26729 osd.189 up 0 1.00000
190 hdd 0.26729 osd.190 up 0 1.00000
191 hdd 0.26729 osd.191 up 0 1.00000
-5 4.33493 root ssds-ro
-6 1.44498 host ceph-storage1-ssd
85 ssd 0.72249 osd.85 up 1.00000 1.00000
89 ssd 0.72249 osd.89 up 1.00000 1.00000
-7 1.44498 host ceph-storage2-ssd
5 ssd 0.72249 osd.5 up 1.00000 1.00000
68 ssd 0.72249 osd.68 up 1.00000 1.00000
-8 1.44498 host ceph-storage3-ssd
160 ssd 0.72249 osd.160 up 1.00000 1.00000
163 ssd 0.72249 osd.163 up 1.00000 1.00000
-1 552.07568 root default
-2 177.96744 host ceph-storage1
0 hdd 3.63199 osd.0 up 1.00000 1.00000
1 hdd 3.63199 osd.1 up 1.00000 1.00000
3 hdd 3.63199 osd.3 up 1.00000 1.00000
4 hdd 3.63199 osd.4 up 1.00000 1.00000
6 hdd 3.63199 osd.6 up 1.00000 1.00000
8 hdd 3.63199 osd.8 up 1.00000 1.00000
11 hdd 3.63199 osd.11 up 1.00000 1.00000
13 hdd 3.63199 osd.13 up 1.00000 1.00000
15 hdd 3.63199 osd.15 up 1.00000 1.00000
18 hdd 3.63199 osd.18 up 1.00000 1.00000
20 hdd 3.63199 osd.20 up 1.00000 1.00000
22 hdd 3.63199 osd.22 up 1.00000 1.00000
25 hdd 3.63199 osd.25 up 1.00000 1.00000
27 hdd 3.63199 osd.27 up 1.00000 1.00000
29 hdd 3.63199 osd.29 up 1.00000 1.00000
32 hdd 3.63199 osd.32 up 1.00000 1.00000
34 hdd 3.63199 osd.34 up 1.00000 1.00000
36 hdd 3.63199 osd.36 up 1.00000 1.00000
39 hdd 3.63199 osd.39 up 1.00000 1.00000
41 hdd 3.63199 osd.41 up 1.00000 1.00000
43 hdd 3.63199 osd.43 up 1.00000 1.00000
48 hdd 3.63199 osd.48 up 1.00000 1.00000
50 hdd 3.63199 osd.50 up 1.00000 1.00000
52 hdd 3.63199 osd.52 up 1.00000 1.00000
55 hdd 3.63199 osd.55 up 1.00000 1.00000
62 hdd 3.63199 osd.62 up 1.00000 1.00000
65 hdd 3.63199 osd.65 up 1.00000 1.00000
66 hdd 3.63199 osd.66 up 1.00000 1.00000
67 hdd 3.63199 osd.67 up 1.00000 1.00000
70 hdd 3.63199 osd.70 up 1.00000 1.00000
72 hdd 3.63199 osd.72 up 1.00000 1.00000
74 hdd 3.63199 osd.74 up 1.00000 1.00000
76 hdd 3.63199 osd.76 up 1.00000 1.00000
79 hdd 3.63199 osd.79 up 1.00000 1.00000
92 hdd 3.63199 osd.92 up 1.00000 1.00000
94 hdd 3.63199 osd.94 up 1.00000 1.00000
97 hdd 3.63199 osd.97 up 1.00000 1.00000
99 hdd 3.63199 osd.99 up 1.00000 1.00000
101 hdd 3.63199 osd.101 up 1.00000 1.00000
104 hdd 3.63199 osd.104 up 1.00000 1.00000
107 hdd 3.63199 osd.107 up 1.00000 1.00000
111 hdd 3.63199 osd.111 up 1.00000 1.00000
112 hdd 3.63199 osd.112 up 1.00000 1.00000
114 hdd 3.63199 osd.114 up 1.00000 1.00000
117 hdd 3.63199 osd.117 up 1.00000 1.00000
119 hdd 3.63199 osd.119 up 1.00000 1.00000
131 hdd 3.63199 osd.131 up 1.00000 1.00000
137 hdd 3.63199 osd.137 up 1.00000 1.00000
139 hdd 3.63199 osd.139 up 1.00000 1.00000
-4 177.96744 host ceph-storage2
7 hdd 3.63199 osd.7 up 1.00000 1.00000
10 hdd 3.63199 osd.10 up 1.00000 1.00000
12 hdd 3.63199 osd.12 up 1.00000 1.00000
14 hdd 3.63199 osd.14 up 1.00000 1.00000
16 hdd 3.63199 osd.16 up 1.00000 1.00000
19 hdd 3.63199 osd.19 up 1.00000 1.00000
21 hdd 3.63199 osd.21 up 1.00000 1.00000
23 hdd 3.63199 osd.23 up 1.00000 1.00000
26 hdd 3.63199 osd.26 up 1.00000 1.00000
28 hdd 3.63199 osd.28 up 1.00000 1.00000
30 hdd 3.63199 osd.30 up 1.00000 1.00000
33 hdd 3.63199 osd.33 up 1.00000 1.00000
37 hdd 3.63199 osd.37 up 1.00000 1.00000
40 hdd 3.63199 osd.40 up 1.00000 1.00000
42 hdd 3.63199 osd.42 up 1.00000 1.00000
44 hdd 3.63199 osd.44 up 1.00000 1.00000
47 hdd 3.63199 osd.47 up 1.00000 1.00000
49 hdd 3.63199 osd.49 up 1.00000 1.00000
51 hdd 3.63199 osd.51 up 1.00000 1.00000
54 hdd 3.63199 osd.54 up 1.00000 1.00000
56 hdd 3.63199 osd.56 up 1.00000 1.00000
57 hdd 3.63199 osd.57 up 1.00000 1.00000
59 hdd 3.63199 osd.59 up 1.00000 1.00000
61 hdd 3.63199 osd.61 up 1.00000 1.00000
63 hdd 3.63199 osd.63 up 1.00000 1.00000
71 hdd 3.63199 osd.71 up 1.00000 1.00000
73 hdd 3.63199 osd.73 up 1.00000 1.00000
75 hdd 3.63199 osd.75 up 1.00000 1.00000
78 hdd 3.63199 osd.78 up 1.00000 1.00000
80 hdd 3.63199 osd.80 up 1.00000 1.00000
81 hdd 3.63199 osd.81 up 1.00000 1.00000
83 hdd 3.63199 osd.83 up 1.00000 1.00000
84 hdd 3.63199 osd.84 up 1.00000 1.00000
90 hdd 3.63199 osd.90 up 1.00000 1.00000
91 hdd 3.63199 osd.91 up 1.00000 1.00000
93 hdd 3.63199 osd.93 up 1.00000 1.00000
96 hdd 3.63199 osd.96 up 1.00000 1.00000
98 hdd 3.63199 osd.98 up 1.00000 1.00000
100 hdd 3.63199 osd.100 up 1.00000 1.00000
102 hdd 3.63199 osd.102 up 1.00000 1.00000
105 hdd 3.63199 osd.105 up 1.00000 1.00000
106 hdd 3.63199 osd.106 up 1.00000 1.00000
108 hdd 3.63199 osd.108 up 1.00000 1.00000
110 hdd 3.63199 osd.110 up 1.00000 1.00000
115 hdd 3.63199 osd.115 up 1.00000 1.00000
116 hdd 3.63199 osd.116 up 1.00000 1.00000
121 hdd 3.63199 osd.121 up 1.00000 1.00000
123 hdd 3.63199 osd.123 up 1.00000 1.00000
132 hdd 3.63199 osd.132 up 1.00000 1.00000
-3 196.14078 host ceph-storage3
2 hdd 3.63199 osd.2 up 1.00000 1.00000
9 hdd 3.63199 osd.9 up 1.00000 1.00000
17 hdd 3.63199 osd.17 up 1.00000 1.00000
24 hdd 3.63199 osd.24 up 1.00000 1.00000
31 hdd 3.63199 osd.31 up 1.00000 1.00000
38 hdd 3.63199 osd.38 up 1.00000 1.00000
45 hdd 3.63199 osd.45 up 1.00000 1.00000
53 hdd 3.63199 osd.53 up 1.00000 1.00000
58 hdd 3.63199 osd.58 up 1.00000 1.00000
64 hdd 3.63199 osd.64 up 1.00000 1.00000
69 hdd 3.63199 osd.69 up 1.00000 1.00000
77 hdd 3.63199 osd.77 up 1.00000 1.00000
82 hdd 3.63199 osd.82 up 1.00000 1.00000
86 hdd 3.63199 osd.86 up 1.00000 1.00000
88 hdd 3.63199 osd.88 up 1.00000 1.00000
95 hdd 3.63199 osd.95 up 1.00000 1.00000
103 hdd 3.63199 osd.103 up 1.00000 1.00000
109 hdd 3.63199 osd.109 up 1.00000 1.00000
113 hdd 3.63199 osd.113 up 1.00000 1.00000
120 hdd 3.63199 osd.120 up 1.00000 1.00000
127 hdd 3.63199 osd.127 up 1.00000 1.00000
134 hdd 3.63199 osd.134 up 1.00000 1.00000
140 hdd 3.63869 osd.140 up 1.00000 1.00000
141 hdd 3.63199 osd.141 up 1.00000 1.00000
143 hdd 3.63199 osd.143 up 1.00000 1.00000
144 hdd 3.63199 osd.144 up 1.00000 1.00000
145 hdd 3.63199 osd.145 up 1.00000 1.00000
146 hdd 3.63199 osd.146 up 1.00000 1.00000
147 hdd 3.63199 osd.147 up 1.00000 1.00000
148 hdd 3.63199 osd.148 up 1.00000 1.00000
149 hdd 3.63199 osd.149 up 1.00000 1.00000
150 hdd 3.63199 osd.150 up 1.00000 1.00000
151 hdd 3.63199 osd.151 up 1.00000 1.00000
152 hdd 3.63199 osd.152 up 1.00000 1.00000
153 hdd 3.63199 osd.153 up 1.00000 1.00000
154 hdd 3.63199 osd.154 up 1.00000 1.00000
155 hdd 3.63199 osd.155 up 1.00000 1.00000
156 hdd 3.63199 osd.156 up 1.00000 1.00000
157 hdd 3.63199 osd.157 up 1.00000 1.00000
158 hdd 3.63199 osd.158 up 1.00000 1.00000
159 hdd 3.63199 osd.159 up 1.00000 1.00000
161 hdd 3.63199 osd.161 up 1.00000 1.00000
162 hdd 3.63199 osd.162 up 1.00000 1.00000
164 hdd 3.63199 osd.164 up 1.00000 1.00000
165 hdd 3.63199 osd.165 up 1.00000 1.00000
167 hdd 3.63199 osd.167 up 1.00000 1.00000
168 hdd 3.63199 osd.168 up 1.00000 1.00000
169 hdd 3.63199 osd.169 up 1.00000 1.00000
170 hdd 3.63199 osd.170 up 1.00000 1.00000
171 hdd 3.63199 osd.171 up 1.00000 1.00000
172 hdd 3.63199 osd.172 up 1.00000 1.00000
173 hdd 3.63199 osd.173 up 1.00000 1.00000
174 hdd 3.63869 osd.174 up 1.00000 1.00000
177 hdd 3.63199 osd.177 up 1.00000 1.00000

# Ceph configuration shared by all nodes
[global]
fsid = 6a2e8f21-bca2-492b-8869-eecc995216cc
public_network = 10.141.0.0/16
cluster_network = 10.85.8.0/22
mon_initial_members = ceph-p-mon1, ceph-p-mon2, ceph-p-mon3
mon_host = 10.141.161.248,10.141.160.250,10.141.167.237
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
# Cephfs needs these to be set to support larger directories
mds_bal_frag = true
allow_dirfrags = true
rbd_default_format = 2
mds_beacon_grace = 60
mds session timeout = 120
log to syslog = true
err to syslog = true
clog to syslog = true

[mds]

[osd]
osd op threads = 32
osd max backfills = 32

# Old method of moving ssds to a pool
[osd.85]
host = ceph-storage1
crush_location = root=ssds host=ceph-storage1-ssd

[osd.89]
host = ceph-storage1
crush_location = root=ssds host=ceph-storage1-ssd

[osd.160]
host = ceph-storage3
crush_location = root=ssds host=ceph-storage3-ssd

[osd.163]
host = ceph-storage3
crush_location = root=ssds host=ceph-storage3-ssd

[osd.166]
host = ceph-storage3
crush_location = root=ssds host=ceph-storage3-ssd

[osd.5]
host = ceph-storage2
crush_location = root=ssds host=ceph-storage2-ssd

[osd.68]
host = ceph-storage2
crush_location = root=ssds host=ceph-storage2-ssd

[osd.87]
host = ceph-storage2
crush_location = root=ssds host=ceph-storage2-ssd