On Mon, 23 Sep 2019, Koebbe, Brian wrote:
> Thanks Sage!
> ceph osd dump: https://pastebin.com/raw/zLPz9DQg
>
> ceph-monstore-tool /var/lib/ceph/mon/ceph-ufm03 dump-keys |grep osd_snap| cut -c-29 |uniq -c
>       2 osd_snap / purged_snap_10_000
>       1 osd_snap / purged_snap_12_000
>      75 osd_snap / purged_snap_13_000
>       4 osd_snap / purged_snap_14_000
>  778106 osd_snap / purged_snap_15_000
>     861 osd_snap / purged_snap_1_0000
>     323 osd_snap / purged_snap_4_0000
>      88 osd_snap / purged_snap_5_0000
>       2 osd_snap / purged_snap_7_0000
>       2 osd_snap / purged_snap_8_0000
>       2 osd_snap / removed_epoch_10_0
>       1 osd_snap / removed_epoch_12_0
>      75 osd_snap / removed_epoch_13_0
>       4 osd_snap / removed_epoch_14_0
> 2316417 osd_snap / removed_epoch_15_0
>     970 osd_snap / removed_epoch_1_00
>     324 osd_snap / removed_epoch_4_00
>      89 osd_snap / removed_epoch_5_00
>       2 osd_snap / removed_epoch_7_00
>       2 osd_snap / removed_epoch_8_00
>       2 osd_snap / removed_snap_10_00
>       1 osd_snap / removed_snap_12_00
>      75 osd_snap / removed_snap_13_00
>       4 osd_snap / removed_snap_14_00
> 2720849 osd_snap / removed_snap_15_00
>    1161 osd_snap / removed_snap_1_000
>     379 osd_snap / removed_snap_4_000
>      89 osd_snap / removed_snap_5_000
>       2 osd_snap / removed_snap_7_000
>       2 osd_snap / removed_snap_8_000

Thanks!  I opened a ticket at https://tracker.ceph.com/issues/42012.  Can you
do the above dump with the 'grep osd_snap' only and attach that to the bug?

This code has been reworked/improved in master, but I'm not sure the behavior
of keeping a full record of past snap deletions was changed, so we may need to
make further improvements for octopus.

Thanks!
sage

> ________________________________
> From: Sage Weil <sage@xxxxxxxxxxxx>
> Sent: Monday, September 23, 2019 9:41 AM
> To: Koebbe, Brian <koebbe@xxxxxxxxx>
> Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>; dev@xxxxxxx <dev@xxxxxxx>
> Subject: Re: Seemingly unbounded osd_snap keys in monstore. Normal? Expected?
>
> Hi,
>
> On Mon, 23 Sep 2019, Koebbe, Brian wrote:
> > Our cluster has a little over 100 RBDs. Each RBD is snapshotted with a typical "frequently", hourly, daily, monthly type of schedule.
> > A while back a 4th monitor was temporarily added to the cluster that took hours to synchronize with the other 3.
> > While trying to figure out why that addition took so long, we discovered that our monitors have what seems like a really large number of osd_snap keys:
> >
> > ceph-monstore-tool /var/lib/ceph/mon/xxxxxx dump-keys |awk '{print $1}'|uniq -c
> >     153 auth
> >       2 config
> >      10 health
> >    1441 logm
> >       3 mdsmap
> >     313 mgr
> >       1 mgr_command_descs
> >       3 mgr_metadata
> >     163 mgrstat
> >       1 mkfs
> >     323 mon_config_key
> >       1 mon_sync
> >       6 monitor
> >       1 monitor_store
> >      32 monmap
> >     120 osd_metadata
> >       1 osd_pg_creating
> > 5818618 osd_snap
> >   41338 osdmap
> >     754 paxos
> >
> > A few questions:
> >
> > Could this be the cause of the slow addition/synchronization?
>
> Probably!
>
> > Is what looks like an unbounded number of osd_snaps expected?
>
> Maybe.  Can you send me the output of 'ceph osd dump'?  Also, if you don't
> mind doing the dump above and grepping out just the osd_snap keys, so I
> can see what they look like and if they match the osd map contents?
>
> Thanks!
> sage
>
> > If trimming/compacting them would help, how would one do that?
> >
> > Thanks,
> > Brian
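
For reference, the dump Sage asks to attach to https://tracker.ceph.com/issues/42012
can be produced with the same tool already used above; a minimal sketch, assuming
the mon store path from Brian's output (the output filename is arbitrary):

  # dump only the osd_snap keys to a file (osd_snap_keys.txt is just an example name)
  # and attach that file to the tracker ticket
  ceph-monstore-tool /var/lib/ceph/mon/ceph-ufm03 dump-keys | grep osd_snap > osd_snap_keys.txt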
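
Likewise, a hedged sketch of the comparison Sage mentions against the osd map
contents; this assumes a Nautilus-era cluster, where each pool's removed snapshot
intervals are printed in the pool lines of 'ceph osd dump':

  # per-pool removed snapshot intervals, to compare against the removed_snap_* keys above
  ceph osd dump | grep removed_snaps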