On Mon, Oct 8, 2018 at 9:46 PM Alfredo Daniel Rezinovsky <alfrenovsky@xxxxxxxxx> wrote:
>
> On 08/10/18 10:20, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 9:07 PM Alfredo Daniel Rezinovsky
> > <alfrenovsky@xxxxxxxxx> wrote:
> >>
> >> On 08/10/18 09:45, Yan, Zheng wrote:
> >>> On Mon, Oct 8, 2018 at 6:40 PM Alfredo Daniel Rezinovsky
> >>> <alfrenovsky@xxxxxxxxx> wrote:
> >>>> On 08/10/18 07:06, Yan, Zheng wrote:
> >>>>> On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin <hell@xxxxxxxxxxx> wrote:
> >>>>>>> On 8.10.2018, at 12:37, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin <hell@xxxxxxxxxxx> wrote:
> >>>>>>>> What additional steps need to be taken in order to (try to) regain access to the fs, provided that I backed up the metadata pool, created an alternate metadata pool, and ran scan_extents, scan_links, scan_inodes, and a somewhat recursive scrub.
> >>>>>>>> After that I only mounted the fs read-only to back up the data.
> >>>>>>>> Would anything even work if I had the mds journal and purge queue truncated?
> >>>>>>>>
> >>>>>>> did you back up the whole metadata pool? did you make any modification
> >>>>>>> to the original metadata pool? If you did, what modifications?
> >>>>>> I backed up both the journal and the purge queue and used cephfs-journal-tool to recover dentries, then reset the journal and purge queue on the original metadata pool.
> >>>>> You can try restoring the original journal and purge queue, then downgrade
> >>>>> the mds to 13.2.1. Journal object names are 20x.xxxxxxxx, purge queue
> >>>>> object names are 50x.xxxxxxxxx.
> >>>> I've already done a scan_extents and am doing a scan_inodes. Do I need to
> >>>> finish with the scan_links?
> >>>>
> >>>> I'm on 13.2.2. Do I finish the scan_links and then downgrade?
> >>>>
> >>>> I have a backup done with "cephfs-journal-tool journal export
> >>>> backup.bin". I think I don't have the purge queue.
> >>>>
> >>>> Can I reset the purge queue journal? Can I import an empty file?
> >>>>
> >>> It's better to restore the journal to the original metadata pool and reset the
> >>> purge queue to empty, then try starting the mds. Resetting the purge queue
> >>> will leave some objects in orphan states, but we can handle them
> >>> later.
> >>>
> >>> Regards
> >>> Yan, Zheng
> >> Let's see...
> >>
> >> "cephfs-journal-tool journal import backup.bin" will restore the whole
> >> metadata?
> >> That's what "journal" means?
> >>
> > It just restores the journal. If you only reset the original fs' journal
> > and purge queue (and ran the scan_foo commands with an alternate metadata pool),
> > it's highly likely that restoring the journal will bring your fs back.
> >
> >> So I can stop cephfs-data-scan, run the import, downgrade, and then
> >> reset the purge queue?
> >>
> > You said you have already run scan_extents and scan_inodes. What
> > cephfs-data-scan command is running?
> Already ran (without an alternate metadata pool):
>
> time cephfs-data-scan scan_extents cephfs_data   # 10 hours
>
> time cephfs-data-scan scan_inodes cephfs_data    # running 3 hours
> with a warning:
> 7fddd8f64ec0 -1 datascan.inject_with_backtrace: Dentry
> 0x0x10000db852b/dovecot.index already exists but points to 0x0x1000134f97f
>
> Still not run:
>
> time cephfs-data-scan scan_links
>
You have modified the metadata pool, so I suggest you run scan_links. After it finishes, reset the session table and try restarting the mds. If the mds starts successfully, run 'ceph daemon mds.x scrub_path / recursive repair'.
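Roughly, something like this (the cephfs-table-tool line is the session-table reset from the disaster-recovery docs, and 'mds.x' is a placeholder for your daemon name; double-check each step against your setup before running it):

cephfs-data-scan scan_links                        # after scan_inodes has finished
cephfs-table-tool all reset session                # reset the session table
systemctl restart ceph-mds@<id>                    # or however you usually restart the mds
ceph daemon mds.x scrub_path / recursive repair    # once the mds is back up:active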
(don't let clients mount before it finishes) Good luck
>
> > After you import the original journal, run 'ceph mds repaired
> > fs_name:damaged_rank', then try restarting the mds. Check if the mds can
> > start.
> >
> >> Please remind me of the commands:
> >> I've been 3 days without sleep, and I don't want to break it more.
> >>
> > sorry for that.
> I updated on Friday and broke a golden rule: "READ ONLY FRIDAY". My fault.
> >> Thanks
> >>
> >>>> What do I do with the journals?
> >>>>
> >>>>>> Before proceeding to the alternate metadata pool recovery I was able to start the MDS, but it soon failed throwing lots of 'loaded dup inode' errors; I'm not sure if that involved changing anything in the pool.
> >>>>>> I have left the original metadata pool untouched since then.
> >>>>>>
> >>>>>>> Yan, Zheng
> >>>>>>>
> >>>>>>>>> On 8.10.2018, at 05:15, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >>>>>>>>>
> >>>>>>>>> Sorry, this is caused by a wrong backport. Downgrading the mds to 13.2.1 and
> >>>>>>>>> marking the mds repaired can resolve this.
> >>>>>>>>>
> >>>>>>>>> Yan, Zheng
> >>>>>>>>> On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin <hell@xxxxxxxxxxx> wrote:
> >>>>>>>>>> Update:
> >>>>>>>>>> I discovered http://tracker.ceph.com/issues/24236 and https://github.com/ceph/ceph/pull/22146
> >>>>>>>>>> Make sure that it is not relevant in your case before proceeding to operations that modify on-disk data.
> >>>>>>>>>>
> >>>>>>>>>> On 6.10.2018, at 03:17, Sergey Malinin <hell@xxxxxxxxxxx> wrote:
> >>>>>>>>>>
> >>>>>>>>>> I ended up rescanning the entire fs using the alternate metadata pool approach as in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
> >>>>>>>>>> The process has not completed yet because during the recovery our cluster encountered another problem with OSDs that I got fixed yesterday (thanks to Igor Fedotov @ SUSE).
> >>>>>>>>>> The first stage (scan_extents) completed in 84 hours (120M objects in the data pool on 8 hdd OSDs on 4 hosts). The second (scan_inodes) was interrupted by the OSD failure, so I have no timing stats, but it seems to be running 2-3 times faster than the extents scan.
> >>>>>>>>>> As to the root cause -- in my case I recall that during the upgrade I had forgotten to restart 3 OSDs, one of which was holding metadata pool contents, before restarting the MDS daemons, and that seemed to have had an impact on the MDS journal corruption, because when I restarted those OSDs the MDS was able to start up but soon failed throwing lots of 'loaded dup inode' errors.
> >>>>>>>>>>
> >>>>>>>>>> On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky <alfrenovsky@xxxxxxxxx> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Same problem...
> >>>>>>>>>>
> >>>>>>>>>> # cephfs-journal-tool --journal=purge_queue journal inspect
> >>>>>>>>>> 2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.0000016c
> >>>>>>>>>> Overall journal integrity: DAMAGED
> >>>>>>>>>> Objects missing:
> >>>>>>>>>>   0x16c
> >>>>>>>>>> Corrupt regions:
> >>>>>>>>>>   0x5b000000-ffffffffffffffff
> >>>>>>>>>>
> >>>>>>>>>> Just after the upgrade to 13.2.2
> >>>>>>>>>>
> >>>>>>>>>> Did you fix it?
> >>>>>>>>>>
> >>>>>>>>>> On 26/09/18 13:05, Sergey Malinin wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hello,
> >>>>>>>>>> Followed the standard upgrade procedure to upgrade from 13.2.1 to 13.2.2.
> >>>>>>>>>> After the upgrade the MDS cluster is down; mds rank 0 and the purge_queue journal are damaged. Resetting the purge_queue does not seem to work well as the journal still appears to be damaged.
> >>>>>>>>>> Can anybody help?
> >>>>>>>>>>
> >>>>>>>>>> mds log:
> >>>>>>>>>>
> >>>>>>>>>> -789> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.mds2 Updating MDS map to version 586 from mon.2
> >>>>>>>>>> -788> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map i am now mds.0.583
> >>>>>>>>>> -787> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map state change up:rejoin --> up:active
> >>>>>>>>>> -786> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 recovery_done -- successful recovery!
> >>>>>>>>>> <skip>
> >>>>>>>>>> -38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue _consume: Decode error at read_pos=0x322ec6636
> >>>>>>>>>> -37> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 set_want_state: up:active -> down:damaged
> >>>>>>>>>> -36> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 _send down:damaged seq 137
> >>>>>>>>>> -35> 2018-09-26 18:42:32.707 7f70f28a7700 10 monclient: _send_mon_message to mon.ceph3 at mon:6789/0
> >>>>>>>>>> -34> 2018-09-26 18:42:32.707 7f70f28a7700  1 -- mds:6800/e4cc09cf --> mon:6789/0 -- mdsbeacon(14c72/mds2 down:damaged seq 137 v24a) v7 -- 0x563b321ad480 con 0
> >>>>>>>>>> <skip>
> >>>>>>>>>> -3> 2018-09-26 18:42:32.743 7f70f98b5700  5 -- mds:6800/3838577103 >> mon:6789/0 conn(0x563b3213e000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=8 cs=1 l=1). rx mon.2 seq 29 0x563b321ab880 mdsbeacon(85106/mds2 down:damaged seq 311 v587) v7
> >>>>>>>>>> -2> 2018-09-26 18:42:32.743 7f70f98b5700  1 -- mds:6800/3838577103 <== mon.2 mon:6789/0 29 ==== mdsbeacon(85106/mds2 down:damaged seq 311 v587) v7 ==== 129+0+0 (3296573291 0 0) 0x563b321ab880 con 0x563b3213e000
> >>>>>>>>>> -1> 2018-09-26 18:42:32.743 7f70f98b5700  5 mds.beacon.mds2 handle_mds_beacon down:damaged seq 311 rtt 0.038261
> >>>>>>>>>> 0> 2018-09-26 18:42:32.743 7f70f28a7700  1 mds.mds2 respawn!
> >>>>>>>>>>
> >>>>>>>>>> # cephfs-journal-tool --journal=purge_queue journal inspect
> >>>>>>>>>> Overall journal integrity: DAMAGED
> >>>>>>>>>> Corrupt regions:
> >>>>>>>>>>   0x322ec65d9-ffffffffffffffff
> >>>>>>>>>>
> >>>>>>>>>> # cephfs-journal-tool --journal=purge_queue journal reset
> >>>>>>>>>> old journal was 13470819801~8463
> >>>>>>>>>> new journal start will be 13472104448 (1276184 bytes past old end)
> >>>>>>>>>> writing journal head
> >>>>>>>>>> done
> >>>>>>>>>>
> >>>>>>>>>> # cephfs-journal-tool --journal=purge_queue journal inspect
> >>>>>>>>>> 2018-09-26 19:00:52.848 7f3f9fa50bc0 -1 Missing object 500.00000c8c
> >>>>>>>>>> Overall journal integrity: DAMAGED
> >>>>>>>>>> Objects missing:
> >>>>>>>>>>   0xc8c
> >>>>>>>>>> Corrupt regions:
> >>>>>>>>>>   0x323000000-ffffffffffffffff
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
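For reference, the journal-restore path discussed above would look roughly like this. It is a sketch only: 'cephfs:0' stands in for your file system name and the damaged rank, and backup.bin must be the journal export taken before any reset.

# downgrade the ceph-mds packages to 13.2.1 first, then:
cephfs-journal-tool journal import backup.bin
cephfs-journal-tool --journal=purge_queue journal reset
ceph mds repaired cephfs:0
# watch the mds log / 'ceph fs status' to see whether the rank comes back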