Hi Patrick,

thanks for the reply

On Fri, 2020-09-04 at 10:25 -0700, Patrick Donnelly wrote:
> > We then started using the cephfs (we keep VM images on the cephfs).
> > The MDS were showing an error. I restarted the MDS but they didn't
> > come back. We then followed the instructions here:
> > https://docs.ceph.com/docs/nautilus/cephfs/disaster-recovery-experts/#disaster-recovery-experts
> > up to truncating the journal. The MDS started again. However, as
> > soon as we started writing the cephfs the MDS crashed. A scrub of
> > the cephfs revealed backtrace damage.
>
> I'm confused why you started the disaster recovery procedure when the
> procedure you follow should result in no damage to the PGs (and
> subsequently CephFS). It'd be helpful to know what this original
> error was.

so, when we re-enabled the cephfs I was monitoring the cluster using
ceph -w and I noticed lots of errors going past, something like

2020-09-03 09:30:24.711 7fd1d2932700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 8537805160800 subtree root 0x1 not in cache
2020-09-03 09:30:24.712 7fd1d2932700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
2020-09-03 09:30:24.712 7fd1d2932700 0 mds.0.journal journal ambig_subtrees:
2020-09-03 09:30:24.712 7fd1d2932700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 8537805208638 subtree root 0x1 not in cache
2020-09-03 09:30:24.712 7fd1d2932700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
2020-09-03 09:30:24.712 7fd1d2932700 0 mds.0.journal journal ambig_subtrees:
2020-09-03 09:30:24.714 7fd1d2932700 0 mds.0.journal EMetaBlob.replay missing dir ino 0x1000003857d
2020-09-03 09:30:24.714 7fd1d2932700 -1 log_channel(cluster) log [ERR] : failure replaying journal (EMetaBlob)
2020-09-03 09:30:24.714 7fd1d2932700 1 mds.store07 respawn!

I, perhaps foolishly, restarted mds daemons. Eventually the last one
didn't come back and the cephfs was in error.

I am not quite sure what we tried at this stage.
I think we started the cephfs scrub which found some backtrace errors.
However, again perhaps foolishly, we started using cephfs during the
scrub process and MDS crashed when the clients started writing to the
cephfs. At this stage should we have waited for the scrub to complete
before allowing the clients to write to the filesystem?

At that stage we started the recovery procedure.

> Backtrace damage is usually resolved with a scrub.

this is not clear from the documentation.

> > We have now followed the remaining steps of the disaster recovery
> > procedure and are waiting for the cephfs-data-scan scan_extents to
> > complete.
> > It would be really helpful if you could give an indication of how
> > long this process will take (we have ~40TB in our cephfs) and how
> > many workers to use.
>
> I don't have any recent data on how long it could take but you might
> try using at least 8 workers.

We are using 4 workers and the first stage hasn't completed yet. Is it
safe to interrupt and restart the procedure with more workers? Can the
workers be run on different machines?

> > The other missing bit of documentation is the cephfs scrubbing. Is
> > that something we should run routinely?
>
> CephFS scrubbing is usually done when something goes wrong or backing
> metadata needs updated for some reason as part of an upgrade (e.g.
> Mimic and snapshot formats). It's not considered necessary to do it
> on a routine basis. RADOS PG scrubbing is sufficient for ensuring that
> the backing data is routinely checked for correctness/redundancy.

ok, that's very helpful information. Does the cephfs need to be in a
particular state for the scrub to be run?
Perhaps us restarting the cephfs uncovered an earlier error:

2020-08-31 12:54:45.976 7f10fe790700 0 mds.2.journal EMetaBlob.replay missing dir ino 0x10002024c23
2020-08-31 12:54:45.979 7f10fe790700 -1 log_channel(cluster) log [ERR] : failure replaying journal (EMetaBlob)
2020-08-31 12:54:45.979 7f10fe790700 1 mds.store06 respawn!

which we hadn't appreciated. Would a scrub have resolved that?

Thanks a lot for your replies.

Regards
magnus

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.