Hi,

Yeah, that is clearly showing a new osdmap epoch a few times per second.
Is there nothing in ceph.audit.log?
You might need to increase the debug levels of the mon leader to see
what is triggering it.

-- dan

On Mon, Nov 8, 2021 at 2:37 PM Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:
>
> Hi Dan,
>
> thanks for the hint.
> The cluster is not doing any changes (rebalance, merging, splitting, or
> something like this). Only normal client traffic via librados.
>
> In the mon.log I regularly see the following messages, which seem to
> correlate with the osd map "changes":
>
> 2021-11-08T14:15:58.915+0100 7f8bd32a3700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd32a3700' had timed out after 0.000000000s
> 2021-11-08T14:15:58.953+0100 7f8bd3aa4700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd3aa4700' had timed out after 0.000000000s
> 2021-11-08T14:15:59.201+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970041 e1970041: 125 total, 125 up, 125 in
> 2021-11-08T14:15:59.242+0100 7f8bd4aa6700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd4aa6700' had timed out after 0.000000000s
> 2021-11-08T14:15:59.480+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970042 e1970042: 125 total, 125 up, 125 in
> 2021-11-08T14:15:59.484+0100 7f8bd32a3700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd32a3700' had timed out after 0.000000000s
> 2021-11-08T14:15:59.520+0100 7f8bd42a5700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd42a5700' had timed out after 0.000000000s
> 2021-11-08T14:15:59.757+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970043 e1970043: 125 total, 125 up, 125 in
> 2021-11-08T14:15:59.797+0100 7f8bd3aa4700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd3aa4700' had timed out after 0.000000000s
> 2021-11-08T14:16:00.047+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970044 e1970044: 125 total, 125 up, 125 in
> 2021-11-08T14:16:00.051+0100 7f8bd4aa6700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd4aa6700' had timed out after 0.000000000s
> 2021-11-08T14:16:00.087+0100 7f8bd32a3700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd32a3700' had timed out after 0.000000000s
> 2021-11-08T14:16:00.329+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970045 e1970045: 125 total, 125 up, 125 in
> 2021-11-08T14:16:00.369+0100 7f8bd4aa6700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd4aa6700' had timed out after 0.000000000s
> 2021-11-08T14:16:00.635+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970046 e1970046: 125 total, 125 up, 125 in
> 2021-11-08T14:16:00.640+0100 7f8bd32a3700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd32a3700' had timed out after 0.000000000s
> 2021-11-08T14:16:00.674+0100 7f8bd3aa4700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd3aa4700' had timed out after 0.000000000s
> 2021-11-08T14:16:00.930+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970047 e1970047: 125 total, 125 up, 125 in
> 2021-11-08T14:16:00.968+0100 7f8bd32a3700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd32a3700' had timed out after 0.000000000s
>
>
> Timeouts after 0.0 seconds?
> In between these timeouts the osdmap epoch is increasing. This happens
> in bursts. Between these bursts there is no new map epoch.
>
>
> Manuel
>
>
> On Mon, 8 Nov 2021 13:01:06 +0100
> Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> > Hi,
> >
> > Okay. Here is another case which was churning the osdmaps:
> > https://tracker.ceph.com/issues/51433
> > Perhaps similar debugging will show what's creating the maps in your
> > case.
> >
> > Cheers, Dan
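
To put a number on "a few times per second", the osdmap lines in the mon log can be parsed and the epoch rate computed. The following is only a rough sketch, not a Ceph tool: it assumes the log line layout shown in the excerpt above, and the function names and the `SAMPLE` constant are made up for illustration.

```python
import re
from datetime import datetime

# Matches mon log lines such as:
#   2021-11-08T14:15:59.201+0100 7f8b... 1 mon.<name>@2(peon).osd e1970041 e1970041: ...
# The layout is assumed from the excerpt in this thread.
EPOCH_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\S+)\s+\S+\s+\d+\s+mon\.\S+\.osd\s+e(\d+)\s+e\d+:"
)


def parse_epochs(lines):
    """Return a list of (timestamp, epoch) for every osdmap line found."""
    samples = []
    for line in lines:
        match = EPOCH_LINE.search(line)
        if match is None:
            continue  # heartbeat_map and other lines are skipped
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
        samples.append((stamp, int(match.group(2))))
    return samples


def epoch_rate(samples):
    """Average number of new epochs per second between first and last sample."""
    if len(samples) < 2:
        return 0.0
    (first_ts, first_epoch), (last_ts, last_epoch) = samples[0], samples[-1]
    elapsed = (last_ts - first_ts).total_seconds()
    return (last_epoch - first_epoch) / elapsed if elapsed > 0 else 0.0


# Three lines copied from the log excerpt in this thread.
SAMPLE = [
    "2021-11-08T14:15:59.201+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970041 e1970041: 125 total, 125 up, 125 in",
    "2021-11-08T14:15:59.242+0100 7f8bd4aa6700 1 heartbeat_map reset_timeout 'Monitor::cpu_tp thread 0x7f8bd4aa6700' had timed out after 0.000000000s",
    "2021-11-08T14:15:59.480+0100 7f8bd2aa2700 1 mon.csdeveubs-u02c01mon03@2(peon).osd e1970042 e1970042: 125 total, 125 up, 125 in",
]

print(f"{epoch_rate(parse_epochs(SAMPLE)):.2f} new osdmap epochs per second")
```

For a real log, pass an open file object instead of `SAMPLE`, for example `parse_epochs(open(path))`, where `path` points at the mon log. Running it over one burst at a time keeps the quiet periods between bursts from diluting the average.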