If we're thinking about a laptop user, we probably do want to try to have a
config that works well out of the box. Just like vstart :-)
Wouldn't setting some limits allow us to better determine whether we're
actually leaking memory? (A rough sketch of what I have in mind is below the
quoted mail.)

On Thu, Nov 29, 2018 at 12:52 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> I think this was discussed some on IRC but I wasn't online at the
> time, and to keep some updates on the list...
>
> On Thu, Nov 29, 2018 at 6:59 AM Erwan Velu <evelu@xxxxxxxxxx> wrote:
> >
> >
> > > Hi Erwan,
> > >
> > >
> > > Out of curiosity, did you look at the mempool stats at all? It's
> > > pretty likely you'll run out of memory with 512MB given our current
> > > defaults, and the memory autotuner won't be able to keep up (it will
> > > do its best, but can't work miracles).
> >
> >
> > As per the ceph-nano project, the cluster is very simple, with the
> > following configuration:
> >
> >   cluster:
> >     id:     b2eafdd3-ec87-4107-afaf-521980bb3d9e
> >     health: HEALTH_OK
> >
> >   services:
> >     mon: 1 daemons, quorum ceph-nano-travis-faa32aebf00b (age 2m)
> >     mgr: ceph-nano-travis-faa32aebf00b(active, since 2m)
> >     osd: 1 osds: 1 up (since 2m), 1 in (since 2m)
> >     rgw: 1 daemon active
> >
> >   data:
> >     pools:   5 pools, 40 pgs
> >     objects: 174 objects, 1.6 KiB
> >     usage:   1.0 GiB used, 9.0 GiB / 10 GiB avail
> >     pgs:     40 active+clean
> >
> >
> > I'll save the mempool stats over time to see what is growing in the idle
> > case.
> >
> >
> > [...]
> >
> > > In any event, when I've tested OSDs with that little memory there have
> > > been fairly dramatic performance impacts in a variety of ways,
> > > depending on what you change. In practice the minimum amount of
> > > memory we can reasonably work with right now is probably around
> > > 1.5-2GB, and we do a lot better with 3-4GB+.
> >
> > In the ceph-nano context, we don't really target performance.
>
> The issue here is that ceph-nano doesn't target performance, but Ceph,
> while we're trying to add more auto-tuning, really doesn't do a good
> job of that itself. So since you haven't made any attempt to configure
> the Ceph daemons to use less memory, they're going to run the way they
> always do, assuming something like 2GB+ per OSD, 512+MB per monitor,
> plus more for the manager (do we have numbers on that?) and rgw
> (highly variable).
>
> And all the data you've shown us says "a newly-deployed cluster will
> use more memory while it spins itself up to steady state". If you run
> it in a larger container, apply a workload to it, and can show that a
> specific daemon keeps using more memory on an ongoing basis, that
> might be concerning. But from what you've said so far, it's just
> spinning things up.
>
> Cutting down what Mark said a bit:
>
> The OSD is going to keep using more memory for every write until it
> hits its per-PG limits. (RGW is probably prompting some cluster writes
> even while idle, which is polluting your "idle" data a bit, but that's
> part of why you're seeing an active cluster use more.)
>
> The OSD is going to do more internal bookkeeping with BlueStore until
> it hits its limits.
>
> The OSD is going to keep reporting PG stats to the monitor/manager
> even if they don't change, and tracking those will use more memory on
> the monitor and manager until they hit their limits. (I imagine that's
> a big part of the "idle" cluster usage increase as well.)
>
> -Greg
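
To be concrete about "setting some limits", something like the snippet below
is what I have in mind. It's only a sketch: the 512 MiB value is a placeholder
for whatever the container can actually spare, and the autotuner has a floor
it can't shrink below, so don't read it as a tested recommendation.

    # ceph.conf inside the ceph-nano container
    # (or: ceph config set osd osd_memory_target 536870912)
    [osd]
        # Ask the memory autotuner to aim for ~512 MiB instead of the
        # 4 GiB default; this is best-effort, not a hard cap.
        osd_memory_target = 536870912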
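
And for the "save the mempool stats over time" part, a dumb loop against the
OSD's admin socket should be enough to show which pool keeps growing while the
cluster sits idle. This assumes the daemon is osd.0 and its admin socket is
reachable from wherever the loop runs (e.g. inside the ceph-nano container):

    # Dump per-mempool item/byte counts every 5 minutes with a timestamp,
    # so growth between samples is easy to diff later.
    while sleep 300; do
        date
        ceph daemon osd.0 dump_mempools
    done | tee -a osd0-mempools.log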