Re: cosd multi-second stalls cause "wrongly marked me down"

Gregory Farnum <gregory.farnum@xxxxxxxxxxxxx> · Wed, 23 Feb 2011 11:12:35 -0800



On Wednesday, February 23, 2011 at 10:54 AM, Sage Weil wrote: 
> On Wed, 23 Feb 2011, Gregory Farnum wrote:
> > I have managed to get OSDs wrongly marking each other down during 
> > startup when they're peering large numbers of PGs/pools, as they 
> > disagree on who they need to be heartbeating (due to the slow handling 
> > of new osd maps and pg creates); if you're mostly seeing OSDs get 
> > incorrectly marked down during low epochs (your original email said 
> > epoch 7) this is probably what you're finding.
> 
> FWIW, this isn't supposed to happen either.. the implementation may be 
> broken somewhat. The idea is that once an OSD starts to expect a 
> heartbeat it should tell them so. And if an OSD is told that a future 
> epoch says it should send heartbeats to node foo, then it will do so, at 
> least until it processes that epoch.
Hmmm -- I don't think they're telling the other OSDs that they're heartbeat partners! At least, I didn't see anything that would make that happen. They just start expecting pings, and in some cases they will start sending them because they notice they're a local replica too, but there's nothing in those messages that says "you owe me pings as of epoch x".
Are there stubs you know of that I should look at in re-implementing this behavior?
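
To check that I'm reading the intended behavior correctly, here is a rough Python sketch of it as I understand the description above. It is a toy model, not the cosd code: the class, method, and field names are all made up for illustration, and the "message" is just a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class Osd:
    """Toy model of one OSD's heartbeat bookkeeping (hypothetical, not Ceph's classes)."""
    osd_id: int
    epoch: int = 0  # newest osd map epoch this OSD has processed
    # Peers we ping because our own processed map says we should.
    map_peers: set = field(default_factory=set)
    # peer id -> epoch at which that peer said we owe it pings.
    requested_by: dict = field(default_factory=dict)

    def expect_heartbeat_from(self, peer: "Osd") -> None:
        """We now expect pings from `peer`, so tell it explicitly."""
        peer.handle_heartbeat_request(self.osd_id, self.epoch)

    def handle_heartbeat_request(self, from_id: int, as_of_epoch: int) -> None:
        """A peer says: 'you owe me pings as of epoch x'."""
        if as_of_epoch > self.epoch:
            # We have not processed that epoch yet, so honor the request for now.
            self.requested_by[from_id] = as_of_epoch
        # Otherwise our own map already covers that epoch and decides for itself.

    def process_map(self, epoch: int, peers: set) -> None:
        """Catch up to `epoch`; our own map is authoritative up to there."""
        self.epoch = epoch
        self.map_peers = set(peers)
        # Drop requests that referred to epochs we have now processed.
        self.requested_by = {
            p: e for p, e in self.requested_by.items() if e > epoch
        }

    def heartbeat_targets(self) -> set:
        """Everyone we should currently be pinging."""
        return self.map_peers | set(self.requested_by)


# A fast OSD has processed epoch 7; a slow one is still at epoch 0.
fast = Osd(osd_id=0)
slow = Osd(osd_id=1)
fast.process_map(7, {1})
fast.expect_heartbeat_from(slow)
assert slow.heartbeat_targets() == {0}  # slow pings osd0 before reaching epoch 7

slow.process_map(7, {0})
assert slow.requested_by == {}          # the explicit request is no longer needed
assert slow.heartbeat_targets() == {0}  # now driven by slow's own map
```

The point of the sketch is only the hand-off: the slow OSD keeps pinging on the strength of the explicit request until its own map catches up, so neither side should time the other out while epochs disagree.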

> > We still have no idea what could be causing the stall *inside* of 
> > tick(), though. :/
> 
> You mean heartbeat(), right? Yep, still no clue... :(
> 
Well, the 28-second stall is inside of tick(), as it arms a timer for the next tick. Heartbeat is definitely failing, but as I recall nobody's quite sure why.