On Mon, Mar 9, 2015 at 8:42 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi Sage,
>
> On Tue, Feb 10, 2015 at 2:51 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> On Mon, 9 Feb 2015, David McBride wrote:
>>> On 09/02/15 15:31, Gregory Farnum wrote:
>>>
>>> > So, memory usage of an OSD is usually linear in the number of PGs it
>>> > hosts. However, that memory can also grow based on at least one other
>>> > thing: the number of OSD Maps required to go through peering. It
>>> > *looks* to me like this is what you're running into, not growth in
>>> > the number of state machines. In particular, those past_intervals you
>>> > mentioned. ;)
>>>
>>> Hi Greg,
>>>
>>> Right, that sounds entirely plausible, and is very helpful.
>>>
>>> In practice, that means I'll need to be careful to avoid this situation
>>> occurring in production, but given that's unlikely to occur except in the
>>> case of non-trivial neglect, I don't think I need be particularly concerned.
>>>
>>> (Happily, I'm in the situation that my existing cluster is purely for testing
>>> purposes; the data is expendable.)
>>>
>>> That said, for my own peace of mind, it would be valuable to have a procedure
>>> that can be used to recover from this state, even if it's unlikely to occur in
>>> practice.
>>
>> The best luck I've had recovering from situations like this is with something
>> like:
>>
>>  - stop all osds
>>  - osd set nodown
>>  - osd set nobackfill
>>  - osd set noup
>>  - set the map cache size smaller to reduce the memory footprint:
>>
>>    osd map cache size = 50
>>    osd map max advance = 25
>>    osd map share max epochs = 25
>>    osd pg epoch persisted max stale = 25

It can cause extreme slowness if you get into a failure situation and your
OSDs need to calculate past intervals across more maps than will fit in the
cache. :(

That said, this might be a good idea as long as you're conscious of needing
to set it back if you get into trouble later on.
-Greg
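
(For anyone following this recipe, here is a rough sketch of the same sequence
as shell commands plus a ceph.conf snippet. It is untested here: the exact way
you stop the daemons depends on your release and init system, and the final
"unset" step is assumed rather than part of the recipe quoted above.)

    # On each OSD host: stop all OSD daemons (command varies by init system;
    # this is the sysvinit form)
    sudo service ceph stop osd

    # Keep the cluster from reacting while the OSDs are down
    ceph osd set nodown
    ceph osd set nobackfill
    ceph osd set noup

    # On each OSD host, shrink the map cache before restarting the OSDs
    # (add to the [osd] section of ceph.conf)
    [osd]
        osd map cache size = 50
        osd map max advance = 25
        osd map share max epochs = 25
        osd pg epoch persisted max stale = 25

    # Once the OSDs are back up and peering has settled, clear the flags
    ceph osd unset noup
    ceph osd unset nodown
    ceph osd unset nobackfill

Per Greg's caveat, plan on raising those map cache settings again afterwards,
or a later failure can leave the OSDs recalculating past intervals across more
maps than the cache will hold.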