Recovery from OSDs loses the mds and rgw keys they use to authenticate
with cephx. You need to get those set up again by using the auth
commands. I don’t have them handy but it is discussed in the mailing
list archives.
-Greg

On Thu, Sep 15, 2022 at 3:28 PM Jorge Garcia <jgarcia@xxxxxxxxxxxx> wrote:
> Yes, I tried restarting them and even rebooting the mds machine. No joy.
> If I try to start ceph-mds by hand, it returns:
>
>   2022-09-15 15:21:39.848 7fc43dbd2700 -1 monclient(hunting):
>   handle_auth_bad_method server allowed_methods [2] but i only support [2]
>   failed to fetch mon config (--no-mon-config to skip)
>
> I found this information online, maybe something to try next:
>
> https://docs.ceph.com/en/quincy/cephfs/recover-fs-after-mon-store-loss/
>
> But I think maybe the mds needs to be running before that?
>
> On 9/15/22 15:19, Wesley Dillingham wrote:
> > Having the quorum / monitors back up may change the MDS and RGW's
> > ability to start and stay running. Have you tried just restarting the
> > MDS / RGW daemons again?
> >
> > Respectfully,
> >
> > *Wes Dillingham*
> > wes@xxxxxxxxxxxxxxxxx
> > LinkedIn <http://www.linkedin.com/in/wesleydillingham>
> >
> > On Thu, Sep 15, 2022 at 5:54 PM Jorge Garcia <jgarcia@xxxxxxxxxxxx> wrote:
> > > OK, I'll try to give more details as I remember them.
> > >
> > > 1. There was a power outage and then power came back up.
> > >
> > > 2. When the systems came back up, I did a "ceph -s" and it never
> > > returned. Further investigation revealed that the ceph-mon processes
> > > had not started in any of the 3 monitors. I looked at the log files
> > > and it said something about:
> > >
> > >   ceph_abort_msg("Bad table magic number: expected 9863518390377041911,
> > >   found 30790637387776 in
> > >   /var/lib/ceph/mon/ceph-gi-cprv-adm-01/store.db/2886524.sst")
> > >
> > > Looking at the internet, I found some suggestions about
> > > troubleshooting monitors in:
> > >
> > > https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/
> > >
> > > I quickly determined that the monitors weren't running, so I found
> > > the section where it said "RECOVERY USING OSDS". The description
> > > made sense:
> > >
> > > "But what if all monitors fail at the same time? Since users are
> > > encouraged to deploy at least three (and preferably five) monitors
> > > in a Ceph cluster, the chance of simultaneous failure is rare. But
> > > unplanned power-downs in a data center with improperly configured
> > > disk/fs settings could fail the underlying file system, and hence
> > > kill all the monitors. In this case, we can recover the monitor
> > > store with the information stored in OSDs."
> > >
> > > So, I did the procedure described in that section, and then made
> > > sure the correct keys were in the keyring and restarted the
> > > processes.
> > >
> > > WELL, I WAS REDOING ALL THESE STEPS WHILE WRITING THIS MAIL
> > > MESSAGE, AND NOW THE MONITORS ARE BACK! I must have missed some
> > > step in the middle of my panic.
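
For anyone hitting the same failure, the "RECOVERY USING OSDS" procedure
referenced above boils down to roughly the steps below. This is a minimal
sketch of the documented workflow, assuming a non-containerized cluster;
the scratch directory, keyring path, and mon ID are illustrative
placeholders, and the troubleshooting-mon page linked above is the
authoritative reference.

  # 1. With the OSDs stopped, extract the cluster maps from every OSD on
  #    each host into a scratch store (repeat/rsync across all OSD hosts
  #    so the maps from all OSDs accumulate in one place).
  ms=/root/mon-store
  mkdir -p $ms
  for osd in /var/lib/ceph/osd/ceph-*; do
      ceph-objectstore-tool --data-path $osd --no-mon-config \
          --op update-mon-db --mon-store-path $ms
  done

  # 2. Rebuild the monitor database from the collected maps; the keyring
  #    must contain the mon. and client.admin keys.
  ceph-monstore-tool $ms rebuild -- --keyring /path/to/admin.keyring

  # 3. On each monitor, move the corrupted store.db aside, install the
  #    rebuilt one, fix ownership, and restart the daemon.
  mv /var/lib/ceph/mon/ceph-<mon-id>/store.db \
     /var/lib/ceph/mon/ceph-<mon-id>/store.db.corrupted
  cp -r $ms/store.db /var/lib/ceph/mon/ceph-<mon-id>/store.db
  chown -R ceph:ceph /var/lib/ceph/mon/ceph-<mon-id>/store.db
  systemctl restart ceph-mon@<mon-id>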
> > >
> > >   # ceph -s
> > >     cluster:
> > >       id:     aaaaaaaa-bbbb-cccc-dddd-ffffffffffff
> > >       health: HEALTH_WARN
> > >               mons are allowing insecure global_id reclaim
> > >
> > >     services:
> > >       mon: 3 daemons, quorum host-a, host-b, host-c (age 19m)
> > >       mgr: host-b(active, since 19m), standbys: host-a, host-c
> > >       osd: 164 osds: 164 up (since 16m), 164 in (since 8h)
> > >
> > >     data:
> > >       pools:   14 pools, 2992 pgs
> > >       objects: 91.58M objects, 290 TiB
> > >       usage:   437 TiB used, 1.2 PiB / 1.7 PiB avail
> > >       pgs:     2985 active+clean
> > >                7    active+clean+scrubbing+deep
> > >
> > > Couple of missing or strange things:
> > >
> > > 1. Missing mds
> > > 2. Missing rgw
> > > 3. New warning showing up
> > >
> > > But overall, better than a couple hours ago. If anybody is still
> > > reading and has any suggestions about how to solve the 3 items above,
> > > that would be great! Otherwise, back to scanning the internet for
> > > ideas...
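
Picking up Greg's point at the top of the thread: a monitor store rebuilt
from the OSDs only contains the keys that could be recovered from the OSDs
themselves, so the mds and rgw daemons are left without valid cephx keys
and fail to authenticate, which is consistent with the
handle_auth_bad_method error ceph-mds printed above. The commands below
are a sketch only: the daemon IDs, keyring paths, and caps are
illustrative and should be matched against how the daemons were
originally deployed (for example, compare with "ceph auth ls" output from
a healthy cluster of the same vintage).

  # Re-create an MDS key; the entity name must match the daemon ID.
  ceph auth get-or-create mds.<mds-id> \
      mon 'allow profile mds' mds 'allow *' osd 'allow rwx' \
      -o /var/lib/ceph/mds/ceph-<mds-id>/keyring

  # Re-create an RGW key; the entity name must match the daemon's client name.
  ceph auth get-or-create client.rgw.<rgw-id> \
      mon 'allow rw' osd 'allow rwx' \
      -o /var/lib/ceph/radosgw/ceph-rgw.<rgw-id>/keyring

  # Restart the daemons once the keyrings are in place.
  systemctl restart ceph-mds@<mds-id>
  systemctl restart ceph-radosgw@rgw.<rgw-id>

As for the new warning, "mons are allowing insecure global_id reclaim" is
the standard post-CVE-2021-20288 health warning rather than fallout from
the outage; once all daemons and clients are on releases that support
secure global_id reclaim, it can be cleared with:

  ceph config set mon auth_allow_insecure_global_id_reclaim false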