Hi Eric,

On Tue, Dec 5, 2023 at 3:43 PM Eric Tittley <Eric.Tittley@xxxxxxxx> wrote:
>
> Hi Venky,
>
> > The recently crashed daemon is likely the MDS which you mentioned in
> > your subsequent email.
>
> The "recently crashed daemon" was the osd.51 daemon, which backs the
> metadata pool.
>
> But yes, in the process of trying to get the system running, I probably
> did a few steps that were unnecessary. The steps generally moved me in
> the right direction until I got to MDS state "up:rejoin", where things
> paused, then got much worse.
>
> Now I'm certainly in the phase of monkeying around, trying desperately
> to get a heartbeat out of the system and probably doing more damage
> than good. If only I could ask the system "what are you actually trying
> to do?" Scrolling through the source code doesn't help much. My next
> step will be to insert some useful debugging messages in the vicinity
> of the error to extract more information. Failing on an assert() has
> advantages, but also massive disadvantages when it comes to debugging.

Those asserts have significant value: they stop the system from doing
funny things at a later point in time.

As far as your issue is concerned, is it possible to simply throw away
this fs and create a new one?

>
> Cheers,
> Eric
>
> On 05/12/2023 06:10, Venky Shankar wrote:
> > Hi Eric,
> >
> > On Mon, Nov 27, 2023 at 8:00 PM Eric Tittley <Eric.Tittley@xxxxxxxx> wrote:
> >> Hi all,
> >>
> >> For about a week our CephFS has experienced issues with its MDS.
> >>
> >> Currently the MDS is stuck in "up:rejoin".
> >>
> >> Issues became apparent when simple commands like "mv foo bar/" hung.
> > I assume the MDS was active at the point in time when the command
> > hung. Would that be correct?
> >
> >> I unmounted CephFS on the clients, evicted those remaining, and then issued
> >>
> >> ceph config set mds.0 mds_wipe_sessions true
> >> ceph config set mds.1 mds_wipe_sessions true
> >>
> >> which allowed me to delete the hung requests.
> > Most likely, the above steps weren't really required. The hung command
> > is possibly a deadlock in the MDS (during rename).
> >
> >> I've lost the exact commands I used, but it was something like
> >>
> >> rados -p cephfs_metadata ls | grep mds
> >> rados rm -p cephfs_metadata mds0_openfiles.0
> >>
> >> etc.
> >>
> >> This allowed the MDS to get to "up:rejoin", where it has been stuck
> >> ever since, which is getting on for five days now.
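
Since the rank has been sitting in up:rejoin for days, the first thing
I'd want to know is what it is actually busy with. One non-destructive
way to find out is to bump the MDS debug level for a short while and
watch the log of the active daemon (cephfs.ceph00.uvlkrw, going by the
"ceph mds stat" output further down). Something along these lines
should do; note that it applies to all MDS daemons and the log grows
very quickly at that level, so drop the override once you have a
capture:

  ceph config set mds debug_mds 20
  # tail the active MDS log for a minute or two, then remove the override
  ceph config rm mds debug_mds

The rejoin-related lines in that capture should at least tell us which
step of rejoin the MDS is stuck at, which would be far more useful than
the messenger-level output quoted further down in your mail.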
> >>
> >> # ceph mds stat
> >> cephfs:1/1 {0=cephfs.ceph00.uvlkrw=up:rejoin} 2 up:standby
> >>
> >> root@ceph00:/var/log/ceph/a614303a-5eb5-11ed-b492-011f01e12c9a# ceph -s
> >>   cluster:
> >>     id:     a614303a-5eb5-11ed-b492-011f01e12c9a
> >>     health: HEALTH_WARN
> >>             1 filesystem is degraded
> >>             1 pgs not deep-scrubbed in time
> >>             2 pool(s) do not have an application enabled
> >>             1 daemons have recently crashed
> >>
> >>   services:
> >>     mon: 3 daemons, quorum ceph00,ceph01,ceph02 (age 57m)
> >>     mgr: ceph01.lvdgyr(active, since 2h), standbys: ceph00.gpwpgs
> >>     mds: 1/1 daemons up, 2 standby
> >>     osd: 91 osds: 90 up (since 78m), 90 in (since 112m)
> >>
> >>   data:
> >>     volumes: 0/1 healthy, 1 recovering
> >>     pools:   5 pools, 1539 pgs
> >>     objects: 138.83M objects, 485 TiB
> >>     usage:   971 TiB used, 348 TiB / 1.3 PiB avail
> >>     pgs:     1527 active+clean
> >>              12   active+clean+scrubbing+deep
> >>
> >>   io:
> >>     client:   3.1 MiB/s rd, 3.16k op/s rd, 0 op/s wr
> >>
> >> # ceph --version
> >> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
> >>
> >> I've tried failing the MDS so it switches, and rebooted a couple of
> >> times. I've also added more OSDs to the metadata pool and taken one
> >> out, as I thought it might be a bad metadata OSD (the "recently
> >> crashed" daemon).
> > This isn't really going to do any good btw.
> >
> > The recently crashed daemon is likely the MDS which you mentioned in
> > your subsequent email.
> >
> >> The error logs are full of lines like the following (the prefix on
> >> all of them is:
> >> Nov 27 14:02:44 ceph00 bash[2145]: debug 2023-11-27T14:02:44.619+0000 7f74e845e700  1 -- [v2:192.168.1.128:6800/2157301677,v1:192.168.1.128:6801/2157301677] --> [v2:192.168.1.133:6896/4289132926,v1:192.168.1.133:6897/4289132926]
> >> )
> >>
> >> crc :-1 s=READY pgs=12 cs=0 l=1 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).send_message enqueueing message m=0x559be00adc00 type=42 osd_op(mds.0.36244:8142873 3.ff 3:ff5b34d6:::1.00000000:head [getxattr parent in=6b] snapc 0=[] ondisk+read+known_if_redirected+full_force+supports_pool_eio e32465) v8
> >> crc :-1 s=READY pgs=12 cs=0 l=1 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).write_message sending message m=0x559be00adc00 seq=8142643 osd_op(mds.0.36244:8142873 3.ff 3:ff5b34d6:::1.00000000:head [getxattr parent in=6b] snapc 0=[] ondisk+read+known_if_redirected+full_force+supports_pool_eio e32465) v8
> >> crc :-1 s=THROTTLE_DONE pgs=12 cs=0 l=1 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_message got 154 + 0 + 30 byte message. envelope type=43 src osd.89 off 0
> >> crc :-1 s=READ_MESSAGE_COMPLETE pgs=12 cs=0 l=1 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_message received message m=0x559be01f4480 seq=8142643 from=osd.89 type=43 osd_op_reply(8142873 1.00000000 [getxattr (30) out=30b] v0'0 uv560123 ondisk = 0) v8
> >> osd_op_reply(8142873 1.00000000 [getxattr (30) out=30b] v0'0 uv560123 ondisk = 0) v8 ==== 154+0+30 (crc 0 0 0) 0x559be01f4480 con 0x559be00ad800
> >> osd_op(unknown.0.36244:8142874 3.ff 3:ff5b34d6:::1.00000000:head [getxattr parent in=6b] snapc 0=[] ondisk+read+known_if_redirected+full_force+supports_pool_eio e32465) v8 -- 0x559be2caec00 con 0x559be00ad800
> >>
> >> These repeat multiple times a second (and are filling /var).
> >> Prior to taking one of the cephfs_metadata OSDs offline, these were
> >> communications from ceph00 to the node hosting the suspected bad OSD.
> >> Now they are between ceph00 and the host of the replacement metadata OSD.
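
Regarding the log flood: those look like messenger-level lines (the
" 1 -- " in the prefix suggests debug_ms is set to 1 or higher on that
daemon). The getxattr "parent" ops are the MDS reading backtrace xattrs
from the metadata pool, which is the kind of I/O it does while
rebuilding state; the fact that they keep hitting the same object
(1.00000000) is more interesting than which OSD host they land on, but
at messenger level they won't tell us why. To stop them from filling
/var you can turn the messenger debugging back down, e.g.:

  ceph config set mds debug_ms 0

(or remove the override with "ceph config rm", depending on how it was
raised in the first place).

Also, since the health output above mentions a recently crashed daemon,
it would be worth sharing the crash metadata for that entry (osd.51, by
the sound of it):

  ceph crash ls
  ceph crash info <crash-id>

where <crash-id> is the id reported by "ceph crash ls" for that crash;
the backtrace in there may be relevant.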
> >>
> >> Does anyone have any suggestion on how to get the MDS to switch from
> >> "up:rejoin" to "up:active"?
> >>
> >> Is there any way to debug this, to determine what the issue really
> >> is? I'm unable to interpret the debug log.
> >>
> >> Cheers,
> >> Eric
> >>
> >> ________________________________________________________
> >> Dr Eric Tittley
> >> Research Computing Officer      www.roe.ac.uk/~ert
> >> Institute for Astronomy         Royal Observatory, Edinburgh
> >
> > --
> > Cheers,
> > Venky

--
Cheers,
Venky
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx