Re: CephFS 16.2.10 problem

Hi,

Regarding your previous email: the MDS could not find a session for those
client(s) in the sessionmap. This is a bit odd, because normally a session
would always exist, but it is harmless here: the MDS is trying to close a
session that is already closed, so it simply ignores the event and moves on.
Do keep us posted about the cluster status.
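
In case it helps, a few read-only commands for keeping an eye on the replay
progress (standard Pacific tooling; the MDS name below is taken from your
logs, so adjust it to whatever ceph fs status reports on your side):

  ceph fs status                              # shows each rank and its current state (replay/resolve/active)
  ceph health detail                          # expands the "filesystem is degraded" warning
  ceph tell mds.cephfs.cmon1.weffbd status    # per-daemon view, including the rank it currently holds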


On Tue, Nov 26, 2024 at 5:26 PM <Alexey.Tsivinsky@xxxxxxxxxxxxxxxxxxxx>
wrote:

>
>
> Good morning!
>
>
> There is a new development in our situation. The cluster blocklist now
> contains entries for the addresses of the servers on which the MDS daemons
> are running. The entries have a lifetime of one day. For now we have
> decided to wait until they expire and see how the cluster behaves.
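>
> For reference, a sketch of how those entries can be inspected and, if one is
> certain the old MDS instances are really gone, removed by hand (the address
> below is only an illustration; use exactly what "ls" prints):
>
>   ceph osd blocklist ls                              # list current blocklist entries with their expiry times
>   ceph osd blocklist rm 172.16.1.201:6800/123456     # illustrative addr:port/nonce, copy the real one from "ls"
>
> Removing an entry for a daemon that is in fact still running risks making
> the journal damage worse, so waiting for the entries to expire is the safer
> option.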
>
>
> Best regards!
>
>
> Alexey Tsivinsky
>
>
> e-mail: a.tsivinsky@xxxxxxxxxxxxxxxxxxxxx
>
> From: Alexey.Tsivinsky@xxxxxxxxxxxxxxxxxxxx <Alexey.Tsivinsky@xxxxxxxxxxxxxxxxxxxx>
> Sent: 25 November 2024 16:48
> To: dparmar@xxxxxxxxxx
> Cc: Marc@xxxxxxxxxxxxxxxxx; ceph-users@xxxxxxx
> Subject: Re: CephFS 16.2.10 problem
>
> Our deployment runs in containers.
> Here is a fresh log from the first MDS; this is actually the entire log
> since it was restarted.
>
>
> debug 2024-11-25T11:15:13.405+0000 7f2b0c4eb900  0 set uid:gid to 167:167
> (ceph:ceph)
> debug 2024-11-25T11:15:13.405+0000 7f2b0c4eb900  0 ceph version 16.2.10
> (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable), process
> ceph-mds, pid 2
> debug 2024-11-25T11:15:13.405+0000 7f2b0c4eb900  1 main not setting numa
> affinity
> debug 2024-11-25T11:15:13.405+0000 7f2b0c4eb900  0 pidfile_write: ignore
> empty --pid-file starting mds.cephfs.cmon1.weffbd at
> debug 2024-11-25T11:15:13.429+0000 7f2afa62c700  1 mds.cephfs.cmon1.weffbd
> Updating MDS map to version 660192 from mon.0
> debug 2024-11-25T11:15:13.989+0000 7f2afa62c700  1 mds.cephfs.cmon1.weffbd
> Updating MDS map to version 660193 from mon.0
> debug 2024-11-25T11:15:13.989+0000 7f2afa62c700  1 mds.cephfs.cmon1.weffbd
> Monitors have assigned me to become a standby.
> debug 2024-11-25T11:15:14.097+0000 7f2afa62c700  1 mds.cephfs.cmon1.weffbd
> Updating MDS map to version 660194 from mon.0
> debug 2024-11-25T11:15:14.097+0000 7f2afa62c700  1 mds.0.660194
> handle_mds_map i am now mds.0.660194
> debug 2024-11-25T11:15:14.097+0000 7f2afa62c700  1 mds.0.660194
> handle_mds_map state change up:boot --> up:replay
> debug 2024-11-25T11:15:14.097+0000 7f2afa62c700  1 mds.0.660194
> replay_start
> debug 2024-11-25T11:15:14.097+0000 7f2afa62c700  1 mds.0.660194  waiting
> for osdmap 123229 (which blocklists prior instance)
> debug 2024-11-25T11:33:01.227+0000 7f2afc630700  1 mds.cephfs.cmon1.weffbd
> asok_command: client ls {prefix=client ls} (starting...)
> debug 2024-11-25T11:38:25.004+0000 7f2afc630700  1 mds.cephfs.cmon1.weffbd
> asok_command: client ls {prefix=client ls} (starting...)
> debug 2024-11-25T11:47:01.855+0000 7f2afa62c700  1 mds.cephfs.cmon1.weffbd
> Updating MDS map to version 660196 from mon.0
> debug 2024-11-25T11:47:27.366+0000 7f2afc630700  1 mds.cephfs.cmon1.weffbd
> asok_command: client ls {prefix=client ls} (starting...)
> debug 2024-11-25T11:49:35.045+0000 7f2afc630700  1 mds.cephfs.cmon1.weffbd
> asok_command: client ls {prefix=client ls} (starting...)
> debug 2024-11-25T12:39:31.023+0000 7f2afc630700  1 mds.cephfs.cmon1.weffbd
> asok_command: client ls {prefix=client ls} (starting...)
>
>
> And here is the log from the second MDS:
>
>
> debug 2024-11-25T11:47:01.353+0000 7fe629b48900  0 set uid:gid to 167:167
> (ceph:ceph)
> debug 2024-11-25T11:47:01.353+0000 7fe629b48900  0 ceph version 16.2.10
> (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable), process
> ceph-mds, pid 2
> debug 2024-11-25T11:47:01.353+0000 7fe629b48900  1 main not setting numa
> affinity
> debug 2024-11-25T11:47:01.353+0000 7fe629b48900  0 pidfile_write: ignore
> empty --pid-file starting mds.cephfs.cmon3.isftcc at
> debug 2024-11-25T11:47:01.357+0000 7fe617c89700  1 mds.cephfs.cmon3.isftcc
> Updating MDS map to version 660194 from mon.2
> debug 2024-11-25T11:47:01.813+0000 7fe617c89700  1 mds.cephfs.cmon3.isftcc
> Updating MDS map to version 660195 from mon.2
> debug 2024-11-25T11:47:01.813+0000 7fe617c89700  1 mds.cephfs.cmon3.isftcc
> Monitors have assigned me to become a standby.
> debug 2024-11-25T11:47:01.869+0000 7fe617c89700  1 mds.cephfs.cmon3.isftcc
> Updating MDS map to version 660196 from mon.2
> debug 2024-11-25T11:47:01.869+0000 7fe617c89700  1 mds.1.660196
> handle_mds_map i am now mds.1.660196
> debug 2024-11-25T11:47:01.869+0000 7fe617c89700  1 mds.1.660196
> handle_mds_map state change up:boot --> up:replay
> debug 2024-11-25T11:47:01.869+0000 7fe617c89700  1 mds.1.660196
> replay_start
> debug 2024-11-25T11:47:01.869+0000 7fe617c89700  1 mds.1.660196  waiting
> for osdmap 123229 (which blocklists prior instance)
> debug 2024-11-25T11:47:01.885+0000 7fe611c7d700  0 mds.1.cache creating
> system inode with ino:0x101
> debug 2024-11-25T11:47:01.885+0000 7fe611c7d700  0 mds.1.cache creating
> system inode with ino:0x1
> debug 2024-11-25T11:47:01.933+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : EMetaBlob.replay sessionmap v 575666093 - 1 > table 0
> debug 2024-11-25T11:47:01.933+0000 7fe61047a700  1 mds.1.sessionmap wipe
> start
> debug 2024-11-25T11:47:01.933+0000 7fe61047a700  1 mds.1.sessionmap wipe
> result
> debug 2024-11-25T11:47:01.933+0000 7fe61047a700  1 mds.1.sessionmap wipe
> done
> debug 2024-11-25T11:47:02.009+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.14061971 v1:
> 172.16.1.71:0/946524698 from time 2024-11-22T19:29:09.805698+0000,
> ignoring
> debug 2024-11-25T11:47:02.013+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.16357226 v1:
> 172.16.1.71:0/203827739 from time 2024-11-22T19:29:10.286233+0000,
> ignoring
> debug 2024-11-25T11:47:02.021+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.14258572 v1:
> 172.16.1.71:0/1272592856 from time 2024-11-22T19:29:16.592609+0000,
> ignoring
> debug 2024-11-25T11:47:02.025+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.11324824
> 172.16.1.63:0/2149107031 from time 2024-11-22T19:29:19.606354+0000,
> ignoring
> debug 2024-11-25T11:47:02.033+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.16316987 v1:
> 172.16.1.72:0/1965357079 from time 2024-11-22T19:29:27.300632+0000,
> ignoring
> debug 2024-11-25T11:47:02.045+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.14097954 v1:
> 172.16.1.72:0/476813996 from time 2024-11-22T19:29:30.228797+0000,
> ignoring
> debug 2024-11-25T11:47:02.061+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.14003810 v1:
> 172.16.1.73:0/1596239356 from time 2024-11-22T19:29:46.319982+0000,
> ignoring
> debug 2024-11-25T11:47:02.061+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.13551599 v1:
> 172.16.1.73:0/3004484684 from time 2024-11-22T19:29:46.759950+0000,
> ignoring
> debug 2024-11-25T11:47:02.065+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.11353464
> 172.16.1.53:0/3526774021 from time 2024-11-22T19:29:53.807769+0000,
> ignoring
> debug 2024-11-25T11:47:02.077+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.14091432 v1:
> 172.16.1.56:0/118111754 from time 2024-11-22T19:30:03.736093+0000,
> ignoring
> debug 2024-11-25T11:47:02.101+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.15650904 v1:
> 172.16.1.74:0/906764012 from time 2024-11-22T19:30:08.025237+0000,
> ignoring
> debug 2024-11-25T11:47:02.105+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.13516577 v1:
> 172.16.1.74:0/1816858230 from time 2024-11-22T19:30:10.976129+0000,
> ignoring
> debug 2024-11-25T11:47:02.117+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.13602798 v1:
> 172.16.1.74:0/1991985682 from time 2024-11-22T19:30:15.279845+0000,
> ignoring
> debug 2024-11-25T11:47:02.121+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.13631484 v1:
> 172.16.1.56:0/4041552059 from time 2024-11-22T19:30:18.903232+0000,
> ignoring
> debug 2024-11-25T11:47:02.125+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.13516304 v1:
> 172.16.1.74:0/1938073649 from time 2024-11-22T19:30:22.873231+0000,
> ignoring
> debug 2024-11-25T11:47:02.129+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.13638960
> 172.16.1.55:0/887596108 from time 2024-11-22T19:30:24.538249+0000,
> ignoring
> debug 2024-11-25T11:47:02.129+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.14097420 v1:
> 172.16.1.75:0/637969159 from time 2024-11-22T19:30:24.591981+0000,
> ignoring
> debug 2024-11-25T11:47:02.153+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.13654791
> 172.16.1.76:0/3475543361 from time 2024-11-22T19:30:44.930601+0000,
> ignoring
> debug 2024-11-25T11:47:02.157+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.15103382
> 172.16.1.76:0/2783362396 from time 2024-11-22T19:30:46.768393+0000,
> ignoring
> debug 2024-11-25T11:47:02.157+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.15718734
> 172.16.1.76:0/897474938 from time 2024-11-22T19:30:50.850188+0000,
> ignoring
> debug 2024-11-25T11:47:02.157+0000 7fe61047a700 -1 log_channel(cluster)
> log [ERR] : replayed stray Session close event for client.11288093
> 172.16.1.76:0/2876628524 from time 2024-11-22T19:30:54.081311+0000,
> ignoring
> debug 2024-11-25T11:49:10.106+0000 7fe619c8d700  1 mds.cephfs.cmon3.isftcc
> asok_command: client ls {prefix=client ls} (starting...)
> debug 2024-11-25T13:43:48.963+0000 7fe619c8d700  1 mds.cephfs.cmon3.isftcc
> asok_command: status {prefix=status} (starting...)
>
>
> Best Regards!
>
>
> Alexey Tsivinsky
>
>
> e-mail: a.tsivinsky@xxxxxxxxxxxxxxxxxxxxx
>
>
> ________________________________
> From: Dhairya Parmar <dparmar@xxxxxxxxxx>
> Sent: 25 November 2024 16:07
> To: Цивинский Алексей Александрович
> Cc: Marc@xxxxxxxxxxxxxxxxx; ceph-users@xxxxxxx
> Subject: Re: Re: CephFS 16.2.10 problem
>
>
>
>
> On Mon, Nov 25, 2024 at 3:33 PM <Alexey.Tsivinsky@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Thanks for your answer!
>
>
> Current status of our cluster
>
>
> cluster:
>     id:     c3d33e01-dfcd-4b39-8614-993370672504
>     health: HEALTH_WARN
>             1 failed cephadm daemon(s)
>             1 filesystem is degraded
>
>   services:
>     mon: 3 daemons, quorum cmon1,cmon2,cmon3 (age 15h)
>     mgr: cmon3.ixtbep(active, since 19h), standbys: cmon1.efktsr
>     mds: 2/2 daemons up
>     osd: 168 osds: 168 up (since 2d), 168 in (since 3w)
>
>   data:
>     volumes: 0/1 healthy, 1 recovering
>     pools:   4 pools, 4641 pgs
>     objects: 181.91M objects, 235 TiB
>     usage:   708 TiB used, 290 TiB / 997 TiB avail
>     pgs:     4630 active+clean
>              11   active+clean+scrubbing+deep
>
> This doesn't reveal much. Can you share MDS logs?
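>
> If the deployment is containerized (cephadm), something along these lines
> should pull them and, if needed, raise the verbosity; this is just a sketch,
> substitute your own daemon names as shown by ceph orch ps:
>
>   cephadm logs --name mds.cephfs.cmon1.weffbd   # container/journald logs for that MDS daemon
>   ceph config set mds debug_mds 10              # temporarily raise the MDS debug level
>   ceph config set mds debug_mds 1/5             # revert to the default afterwards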
>
>
>
> We are trying to run cephfs-journal-tool --rank cephfs:0 journal inspect,
> but the utility does nothing.
>
> If the ranks are unavailable, it won't do anything. Do you see any log
> statements like "Couldn't determine MDS rank."?
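>
> For reference, the usual invocations would look something like this (run
> them only while no MDS is actively holding the rank, i.e. with the MDS
> daemons stopped or the filesystem marked down):
>
>   cephfs-journal-tool --rank cephfs:0 journal inspect    # integrity check of rank 0's journal
>   cephfs-journal-tool --rank cephfs:0 header get         # journal header: expire/write positions
>   cephfs-journal-tool --rank cephfs:0 event get summary  # count of journal events by type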
>
>
> We thought the MDS daemons were holding locks on their journals, so we
> stopped them. But the utility still does not work, and ceph -s says that one
> MDS is running, although we checked that we had stopped all the processes.
> So apparently the journals are being locked somewhere else.
> What else can be done? Should we restart the monitors?
>
>
> Best Regards!
>
>
> Alexey Tsivinsky
>
>
> e-mail: a.tsivinsky@xxxxxxxxxxxxxxxxxxxxx
>
>
>
> From: Dhairya Parmar <dparmar@xxxxxxxxxx>
> Sent: 25 November 2024 12:19
> To: Цивинский Алексей Александрович
> Cc: Marc@xxxxxxxxxxxxxxxxx; ceph-users@xxxxxxx
> Subject: Re: Re: CephFS 16.2.10 problem
>
> Hi,
>
> The log you shared indicates that the MDS is waiting for the latest OSDMap
> epoch. The epoch number in that log line (123138) is the epoch of the last
> failure. Any MDS entering the replay state needs at least this osdmap epoch
> to ensure the blocklist propagates. If its epoch is lower than this, it just
> goes back to waiting.
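>
> To see how far the cluster actually is from that epoch, a few read-only
> checks should be enough (it is also worth double-checking that the flags set
> during the shutdown, noout/nobackfill/norecover/norebalance/nodown/pause,
> have all been unset):
>
>   ceph osd stat               # prints the current osdmap epoch
>   ceph osd dump | head -n 5   # same epoch plus any flags that are still set
>   ceph osd blocklist ls       # the blocklist entries the replaying MDS is waiting to see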
>
> I have limited knowledge of the OSDs, but you mentioned executing some OSD
> commands in your initial mail; I'm not sure whether the issue lies there.
> You could check and share the OSD logs, or maybe `ceph -s` could
> reveal some potential warnings.
>
>
> Dhairya Parmar
>
> Associate Software Engineer, CephFS
>
> IBM, Inc.
>
>
>
> On Mon, Nov 25, 2024 at 1:29 PM <Alexey.Tsivinsky@xxxxxxxxxxxxxxxxxxxx> wrote:
> Good afternoon
>
> We tried leaving only one MDS: we stopped the others, even removed one, and
> turned off the requirement for a standby MDS. Nothing helped; the MDS stayed
> in the replay state.
> Current situation: we now have two active MDS daemons stuck in replay and
> one in standby.
> In the logs we see the message
> mds.0.660178  waiting for osdmap 123138 (which blocklists prior instance)
> and at the same time there is no activity on either MDS.
> Running the cephfs-journal-tool journal inspect utility produces no results:
> it ran for 12 hours without any output, so we stopped it.
>
> Could this blocklisting be the problem? How can we remove it?
>
> Best regards!
>
> Alexey Tsivinsky
> e-mail: a.tsivinsky@xxxxxxxxxxxxxxxxxxxxx
> ________________________________________
> From: Marc <Marc@xxxxxxxxxxxxxxxxx>
> Sent: 25 November 2024 1:47
> To: Цивинский Алексей Александрович; ceph-users@xxxxxxx
> Subject: RE: CephFS 16.2.10 problem
>
> >
> > The following problem occurred.
> > There is a cluster ceph 16.2.10
> > The cluster was operating normally on Friday. Shutdown procedure:
> > - Disconnected all clients
> > - Executed the commands:
> > ceph osd set noout
> > ceph osd set nobackfill
> > ceph osd set norecover
> > ceph osd set norebalance
> > ceph osd set nodown
> > ceph osd set pause
> > Shut the cluster down and carried out the server maintenance.
> > Powered the cluster back on. It came back up and found all the nodes, and
> > that is where the problem began. After all OSDs came up and all PGs became
> > available, CephFS refused to start.
> > Now the MDS daemons are in the replay state and do not move on to active.
> > Previously one of them was in the replay (laggy) state, so we
> > executed the command:  ceph config set mds mds_wipe_sessions true
> > After that the MDS switched to the replay state, a third one started as a
> > standby, and the MDS crashes with an error stopped occurring.
> > But cephfs is still unavailable.
> > What else can we do?
> > The cluster is very large, almost 200 million files.
> >
>
> I assume you tried to start just one mds and wait until it would come up
> as active (before starting the others)?
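>
> Something along these lines, on a cephadm deployment (daemon names here are
> the examples from the logs earlier in the thread; note that with two active
> ranks each rank still needs its own MDS to finish replay):
>
>   ceph orch ps --daemon-type mds                   # list the MDS daemons and where they run
>   ceph orch daemon stop mds.cephfs.cmon3.isftcc    # stop the extra daemon(s)
>   ceph fs status                                   # watch whether the remaining MDS gets past up:replay
>   ceph orch daemon start mds.cephfs.cmon3.isftcc   # bring the others back afterwards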
>
>
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



