mon load..

Sage Weil <sage@xxxxxxxxxxx> · Thu, 30 May 2013 19:50:48 -0700 (PDT)

I'm playing with Mark's cluster, where he is seeing high ceph-mon cpu 
utilization when he creates new big pools.  I'm able to fairly reliably 
reproduce a livelock where it is stuck checking is_readable on queued auth 
requests long enough that it times out on the election and has to start 
all over again.

I see two issues:

- The PaxosService stuff is pulling values directly out of leveldb, and 
that is slow in this case.  Not completely sure why (compaction in the 
background?  who knows.)  But, it's also unnecessary.. except that there 
is currently no notification of the PaxosService instances when the 
underlying data changes.  That most easily plugs into a few places in the 
Paxos class and on startup, but the way the layering is structured it's 
not very clean.  Not sure what the right way to fix this up is.. but I 
think we do want some sort of PaxosService::refresh() that tells us 
whenever things changed; it can be the one to call the child's 
update_from_paxos().

- We should be able to discard those auth messages (and others!) if the 
original connection they came from has disconnected.. which it normally 
will, since the client disconnects after 3 seconds (by default).
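
To make the two ideas above concrete, here is a rough sketch.  It is not 
the real Ceph code: ServiceSketch, PaxosSketch, CountingService, 
ConnectionSketch, QueuedRequest and discard_dead() are all made-up names, 
and the is_readable() here is a simplified stand-in, not the monitor's 
actual check.  It only assumes that Paxos can push a refresh() to each 
service on commit, and that a queued request keeps a handle to the 
connection it arrived on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a PaxosService; not the real Ceph class.
class ServiceSketch {
public:
  virtual ~ServiceSketch() = default;

  // Pushed by the Paxos layer on every commit (and once on startup), so
  // readers consult cached state instead of going back to the store.
  void refresh(std::uint64_t committed) {
    assert(committed >= cached_version_);  // versions never go backwards
    if (committed == cached_version_)
      return;                              // nothing changed; skip reload
    update_from_paxos(committed);          // child reloads its state once
    cached_version_ = committed;
  }

  // Simplified readability check, answered from memory only.
  bool is_readable(std::uint64_t wanted) const {
    return wanted <= cached_version_;
  }

protected:
  virtual void update_from_paxos(std::uint64_t version) = 0;

private:
  std::uint64_t cached_version_ = 0;
};

// Example child that just counts how often it had to reload.
struct CountingService : ServiceSketch {
  int updates = 0;
protected:
  void update_from_paxos(std::uint64_t) override { ++updates; }
};

// Hypothetical stand-in for the Paxos class: notifies services on commit.
class PaxosSketch {
public:
  void register_service(ServiceSketch* s) { services_.push_back(s); }
  void commit() {
    ++last_committed_;
    for (ServiceSketch* s : services_)
      s->refresh(last_committed_);
  }

private:
  std::uint64_t last_committed_ = 0;
  std::vector<ServiceSketch*> services_;
};

// Hypothetical connection handle and queued request.
struct ConnectionSketch { bool connected = true; };
struct QueuedRequest {
  std::shared_ptr<ConnectionSketch> con;
  std::string payload;
};

// Drops requests whose originating connection is gone; returns how many
// were dropped.  Surviving requests keep their relative order.
std::size_t discard_dead(std::deque<QueuedRequest>& q) {
  auto dead = [](const QueuedRequest& r) {
    return !r.con || !r.con->connected;
  };
  auto first_dead = std::remove_if(q.begin(), q.end(), dead);
  auto dropped = static_cast<std::size_t>(q.end() - first_dead);
  q.erase(first_dead, q.end());
  return dropped;
}
```

The idea would be to call something like discard_dead() on the wait queue 
before re-checking is_readable() for each entry, so that stale auth 
requests do not eat the time we need to finish the election.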

There is also wip-mon-trim that will lower the trim periodicity (and 
compaction) for the paxos states; that ought to help some as well, but I 
haven't tried it yet...

sage
