Hi,

I'm also seeing a slow memory increase over time with my BlueStore NVMe OSDs (3.2 TB each), with default ceph.conf settings (Ceph 12.2.2). Each OSD starts at around 5 GB of memory and goes up to 8 GB. Currently I'm restarting them about once a month to free memory.

Here is a dump of osd.0 after one week of running:

ceph 2894538 3.9 9.9 7358564 6553080 ? Ssl mars01 303:03 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

root@ceph4-1:~# ceph daemon osd.0 dump_mempools
{
    "bloom_filter": { "items": 0, "bytes": 0 },
    "bluestore_alloc": { "items": 84070208, "bytes": 84070208 },
    "bluestore_cache_data": { "items": 168, "bytes": 2908160 },
    "bluestore_cache_onode": { "items": 947820, "bytes": 636935040 },
    "bluestore_cache_other": { "items": 101250372, "bytes": 2043476720 },
    "bluestore_fsck": { "items": 0, "bytes": 0 },
    "bluestore_txc": { "items": 8, "bytes": 5760 },
    "bluestore_writing_deferred": { "items": 85, "bytes": 1203200 },
    "bluestore_writing": { "items": 7, "bytes": 569584 },
    "bluefs": { "items": 1774, "bytes": 106360 },
    "buffer_anon": { "items": 68307, "bytes": 17188636 },
    "buffer_meta": { "items": 284, "bytes": 24992 },
    "osd": { "items": 333, "bytes": 4017312 },
    "osd_mapbl": { "items": 0, "bytes": 0 },
    "osd_pglog": { "items": 1195884, "bytes": 298139520 },
    "osdmap": { "items": 4542, "bytes": 384464 },
    "osdmap_mapping": { "items": 0, "bytes": 0 },
    "pgmap": { "items": 0, "bytes": 0 },
    "mds_co": { "items": 0, "bytes": 0 },
    "unittest_1": { "items": 0, "bytes": 0 },
    "unittest_2": { "items": 0, "bytes": 0 },
    "total": { "items": 187539792, "bytes": 3089029956 }
}

Another OSD after one month:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
ceph 1718009 2.5 11.7 8542012 7725992 ? Ssl 2017 2463:28 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph

root@ceph4-1:~# ceph daemon osd.5 dump_mempools
{
    "bloom_filter": { "items": 0, "bytes": 0 },
    "bluestore_alloc": { "items": 98449088, "bytes": 98449088 },
    "bluestore_cache_data": { "items": 759, "bytes": 17276928 },
    "bluestore_cache_onode": { "items": 884140, "bytes": 594142080 },
    "bluestore_cache_other": { "items": 116375567, "bytes": 2072801299 },
    "bluestore_fsck": { "items": 0, "bytes": 0 },
    "bluestore_txc": { "items": 6, "bytes": 4320 },
    "bluestore_writing_deferred": { "items": 99, "bytes": 1190045 },
    "bluestore_writing": { "items": 11, "bytes": 4510159 },
    "bluefs": { "items": 1202, "bytes": 64136 },
    "buffer_anon": { "items": 76863, "bytes": 21327234 },
    "buffer_meta": { "items": 910, "bytes": 80080 },
    "osd": { "items": 328, "bytes": 3956992 },
    "osd_mapbl": { "items": 0, "bytes": 0 },
    "osd_pglog": { "items": 1118050, "bytes": 286277600 },
    "osdmap": { "items": 6073, "bytes": 551872 },
    "osdmap_mapping": { "items": 0, "bytes": 0 },
    "pgmap": { "items": 0, "bytes": 0 },
    "mds_co": { "items": 0, "bytes": 0 },
    "unittest_1": { "items": 0, "bytes": 0 },
    "unittest_2": { "items": 0, "bytes": 0 },
    "total": { "items": 216913096, "bytes": 3100631833 }
}
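To gather the same numbers for every OSD on a host in one go, a small loop over the admin sockets works. This is only a rough sketch: it assumes the default admin socket path under /var/run/ceph/ and that jq is installed, and on releases newer than Luminous the total may sit under a "mempool" key instead of at the top level.

#!/bin/sh
# Rough sketch: per-OSD mempool total next to process RSS on this host.
# Assumes default admin socket paths (/var/run/ceph/) and that jq is present.
for sock in /var/run/ceph/ceph-osd.*.asok; do
    id=${sock##*ceph-osd.}; id=${id%.asok}               # numeric OSD id from the socket name
    pool_bytes=$(ceph daemon "osd.$id" dump_mempools | jq '.total.bytes')
    pid=$(pgrep -f "ceph-osd .*--id $id " | head -n 1)   # trailing space so id 1 does not match id 10
    rss_kb=$(ps -o rss= -p "$pid")
    printf 'osd.%s  mempools=%s bytes  rss=%s kB\n' "$id" "$pool_bytes" "$rss_kb"
done

For the two OSDs above this would print roughly 3.1 GB of mempool-accounted memory against 6.5-7.7 GB of RSS, which is exactly the gap being discussed in this thread.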
----- Original Message -----
From: "Kjetil Joergensen" <kjetil@xxxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Wednesday, March 7, 2018 01:07:06
Subject: Re: Memory leak in Ceph OSD?

Hi,

Addendum: we're running 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b). The workload is a mix of 3x-replicated and erasure-coded pools (rbd, cephfs, rgw).

-KJ

On Tue, Mar 6, 2018 at 3:53 PM, Kjetil Joergensen <kjetil@xxxxxxxxxxxx> wrote:

Hi,

so.. +1

We don't run compression as far as I know, so that wouldn't be it. We do actually run a mix of bluestore & filestore, due to the rest of the cluster predating a stable bluestore by some amount. The interesting part is that the behavior seems to be specific to our bluestore nodes. Below: the yellow line is a node with 10 x ~4TB SSDs, the green line a node with 8 x 800GB SSDs, and the blue line is the dump_mempools total bytes for all the OSDs running on the yellow-line node. The big dips are forced restarts, after having previously suffered through the after-effects of letting Linux deal with it by OOM->SIGKILL. As a gross extrapolation, "right now" the memory used seems to be close enough to the sum of the RSS of the ceph-osd processes running on the machines.

-KJ

On Thu, Mar 1, 2018 at 7:18 PM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:

On Thu, Mar 1, 2018 at 5:37 PM, Subhachandra Chandra <schandra@xxxxxxxxxxxx> wrote:
> Even with bluestore we saw memory usage plateau at 3-4 GB with 8 TB drives
> filled to around 90%. One thing that does increase memory usage is the
> number of clients simultaneously sending write requests to a particular
> primary OSD if the write sizes are large.

We have not seen a memory increase on Ubuntu 16.04, but I have repeatedly observed the following phenomenon: when doing a vMotion in ESXi of a large 3 TB file (which generates a lot of small IO requests) to a Ceph pool with compression set to force, after some time the Ceph cluster shows a large number of blocked requests, and eventually the timeouts become very large (to the point where ESXi aborts the IO due to timeouts). After the abort, the blocked/slow request messages disappear. There are no OSD errors. I have OSD logs if anyone is interested. This does not occur when compression is unset.

--
Alex Gorbachev
Storcium

> Subhachandra
>
> On Thu, Mar 1, 2018 at 6:18 AM, David Turner <drakonstein@xxxxxxxxx> wrote:
>>
>> With default memory settings, the general rule is 1 GB of RAM per 1 TB of OSD.
>> If you have a 4 TB OSD, you should plan to have at least 4 GB of RAM. This was
>> the recommendation for filestore OSDs, and it was a bit more memory than those
>> OSDs actually needed. From what I've seen, this rule is a little more
>> appropriate with bluestore now and should still be observed.
>>
>> Please note that memory usage in a HEALTH_OK cluster is not the same
>> amount of memory that the daemons will use during recovery. I have seen
>> deployments with 4x memory usage during recovery.
>>
>> On Thu, Mar 1, 2018 at 8:11 AM Stefan Kooman <stefan@xxxxxx> wrote:
>>>
>>> Quoting Caspar Smit (casparsmit@xxxxxxxxxxx):
>>> > Stefan,
>>> >
>>> > How many OSDs and how much RAM are in each server?
>>>
>>> Currently 7 OSDs and 128 GB RAM. The maximum will be 10 OSDs in these
>>> servers, with 12 cores (at least one core per OSD).
>>>
>>> > bluestore_cache_size=6G will not mean each OSD is using max 6 GB RAM,
>>> > right?
>>>
>>> Apparently not. Sure, they will use more RAM than just the cache to
>>> function correctly, but I figured 3 GB per OSD would be enough ...
>>>
>>> > Our bluestore HDD OSDs with bluestore_cache_size at 1G use ~4 GB of
>>> > total RAM. The cache is only a part of the memory usage of bluestore OSDs.
>>>
>>> A factor of 4 is quite high, isn't it? What is all this RAM used for
>>> besides the cache? RocksDB?
>>>
>>> So how should I size the amount of RAM in an OSD server for 10 bluestore
>>> SSDs in a replicated setup?
>>>
>>> Thanks,
>>>
>>> Stefan
>>>
>>> --
>>> | BIT BV  http://www.bit.nl/  Kamer van Koophandel 09090351
>>> | GPG: 0xD14839C6  +31 318 648 688 / info@xxxxxx

--
Kjetil Joergensen <kjetil@xxxxxxxxxxxx>
SRE, Medallia Inc
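To put rough numbers on the sizing question above, here is a back-of-envelope sketch using only figures quoted in this thread: Stefan's 10 OSDs and 128 GB per server, Caspar's observation of roughly 4x the configured cache as total RAM, and David's note that recovery can need several times the HEALTH_OK usage. The osd.0 id and the 3 GiB cache value are illustrative assumptions, not recommendations, and there is no guarantee the 4x factor scales linearly.

# Read back the cache settings actually in effect on one OSD (osd.0 is just an example id):
ceph daemon osd.0 config get bluestore_cache_size
ceph daemon osd.0 config get bluestore_cache_size_ssd   # used when bluestore_cache_size is 0 and the OSD sits on flash

# Back-of-envelope budget, using only numbers quoted in this thread:
osds=10        # Stefan's eventual OSD count per server
cache_gib=3    # illustrative bluestore_cache_size of 3 GiB per OSD
factor=4       # Caspar reports ~4x the configured cache as total RAM on his HDD OSDs
echo "rough steady-state budget: $((osds * cache_gib * factor)) GiB of the 128 GiB host"
# -> 120 GiB, i.e. essentially the whole machine, before allowing for the extra
#    memory David has seen during recovery (up to 4x HEALTH_OK usage).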
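One more data point that may help with the RSS-versus-mempool gap shown in the dumps at the top of this thread, and with the monthly restarts: the heap profiler commands report how much freed memory the allocator is still holding and can hand it back without a restart. This is a sketch of a diagnostic not discussed above, and it assumes the OSDs are running with tcmalloc (the default in the packaged builds); osd.0 is just an example id.

# Ask one OSD for tcmalloc's view of its heap:
ceph tell osd.0 heap stats

# Return memory that tcmalloc has freed internally but not yet given back to the kernel.
# This only shrinks RSS if the allocator is actually sitting on freed pages; it does not
# fix a genuine leak, and it does not change the mempool numbers above.
ceph tell osd.0 heap release

If RSS drops significantly after a release, the growth is mostly allocator behaviour rather than objects the OSD still references.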
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com