A week or two back, I had some cases where cosd got killed by the OOM
killer on my test box. Someone else was running other memory-hungry
programs on the same machine, so I assumed that was the cause, and
since it didn't happen again after the first couple of times I turned
my attention to other things.

Unfortunately SIGKILL, which the OOM killer sends, can't be caught or
handled. However, it would be nice if we could dump out a memory usage
report when usage rises above a certain (user-defined) threshold
(there's a rough sketch of the idea at the bottom of this mail).

Colin

On Tue, Jan 4, 2011 at 1:58 PM, John Leach <john@xxxxxxxxxxxxxxx> wrote:
> Hi,
>
> I've got a 3 node test cluster (3 mons, 3 osds) with about 24,000,000
> very small objects across 2400 pools (written directly with librados,
> this isn't a Ceph filesystem).
>
> The cosd processes have steadily grown in RAM usage and have finally
> exhausted memory and are getting killed by the OOM killer (the nodes
> have 6 GB of RAM and no swap).
>
> When I start them back up they very quickly grow in RAM usage again
> and get killed.
>
> Is this expected? Do the OSDs require a certain amount of resident
> memory relative to the data size (or perhaps the number of objects)?
>
> Can you offer any guidance on planning for RAM usage?
>
> I'm running Ceph 0.24 on 64-bit Ubuntu Lucid servers. In case it's
> useful, I've only written these objects serially: no reads, no
> rewrites, updates or snapshots.
>
> I've got some further questions/observations about disk usage with
> this scenario, but I'll start a separate thread about that.
>
> Thanks,
>
> John.
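
P.S. To illustrate the kind of memory usage report I had in mind,
here's a rough standalone sketch. It isn't based on any existing cosd
code; the threshold, poll interval, and function names are all made up
for illustration. It just watches the process's resident set size via
/proc/self/status and prints a warning once it crosses a user-defined
limit. A real version would hook into the daemon and dump per-subsystem
numbers rather than a single total:

#include <chrono>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>

// Read VmRSS (in kilobytes) from /proc/self/status; returns 0 on failure.
static long read_rss_kb() {
  std::ifstream status("/proc/self/status");
  std::string line;
  while (std::getline(status, line)) {
    if (line.compare(0, 6, "VmRSS:") == 0) {
      std::istringstream iss(line.substr(6));
      long kb = 0;
      iss >> kb;
      return kb;
    }
  }
  return 0;
}

// Poll every few seconds and report (once) when RSS exceeds the limit.
static void memory_watchdog(long limit_kb) {
  bool reported = false;
  for (;;) {
    long rss = read_rss_kb();
    if (!reported && rss > limit_kb) {
      std::cerr << "memory usage report: RSS " << rss
                << " kB exceeds " << limit_kb << " kB threshold\n";
      // A real implementation would dump per-subsystem usage here
      // (caches, in-flight ops, etc.) instead of just the total.
      reported = true;
    }
    std::this_thread::sleep_for(std::chrono::seconds(5));
  }
}

int main() {
  // Example: warn when the process passes ~4 GB resident (made-up value).
  std::thread(memory_watchdog, 4L * 1024 * 1024).detach();

  // ... the daemon's real work would happen here; sleep as a stand-in ...
  std::this_thread::sleep_for(std::chrono::minutes(1));
  return 0;
}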