On 4 Mar 2019, at 16.09, Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
Bloated to ~4 GB per OSD and you are on HDDs?
Something like that, yes.
13.2.3 backported the cache auto-tuning which targets 4 GB memory
usage by default.
See https://ceph.com/releases/13-2-4-mimic-released/
Right, thanks…
The bluestore_cache_* options are no longer needed. They are replaced
by osd_memory_target, defaulting to 4GB. BlueStore will expand
and contract its cache to attempt to stay within this
limit. Users upgrading should note this is a higher default
than the previous bluestore_cache_size of 1GB, so OSDs using
BlueStore will use more memory by default.
For more details, see the BlueStore docs.
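So per the note above, setting the target should just be one new key in ceph.conf; spelling it with spaces or underscores should both parse as far as I know (4294967296 below is merely the quoted 4 GB default written out, not a recommendation):

[osd]
; assumption: this single key replaces the old bluestore_cache_* tuning, per the release note
osd_memory_target = 4294967296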
Adding an 'osd memory target' value to our ceph.conf and restarting an OSD just makes the OSD abort and dump like this:
[osd]
; this key makes 13.2.4 OSDs abort???
osd memory target = 1073741824
; other OSD key settings
osd pool default size = 2 # Write an object 2 times.
osd pool default min size = 1 # Allow writing one copy in a degraded state.
osd pool default pg num = 256
osd pool default pgp num = 256
client cache size = 131072
osd client op priority = 40
osd op threads = 8
osd client message size cap = 512
filestore min sync interval = 10
filestore max sync interval = 60
recovery max active = 2
recovery op priority = 30
osd max backfills = 2
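For reference, 1073741824 is 1 GiB, i.e. well below the 4 GB default quoted above. To double-check what value an OSD actually picks up, the admin socket should work (osd.12 below is just one of our daemons, adjust as needed):

ceph daemon osd.12 config get osd_memory_target   # ask the running OSD via its admin socket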
osd log snippet:
-472> 2019-03-05 08:36:02.233 7f2743a8c1c0 1 -- - start start
-471> 2019-03-05 08:36:02.234 7f2743a8c1c0 2 osd.12 0 init /var/lib/ceph/osd/ceph-12 (looks like hdd)
-470> 2019-03-05 08:36:02.234 7f2743a8c1c0 2 osd.12 0 journal /var/lib/ceph/osd/ceph-12/journal
-469> 2019-03-05 08:36:02.234 7f2743a8c1c0 1 bluestore(/var/lib/ceph/osd/ceph-12) _mount path /var/lib/ceph/osd/ceph-12
-468> 2019-03-05 08:36:02.235 7f2743a8c1c0 1 bdev create path /var/lib/ceph/osd/ceph-12/block type kernel
-467> 2019-03-05 08:36:02.235 7f2743a8c1c0 1 bdev(0x55b31af4a000 /var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
-466> 2019-03-05 08:36:02.236 7f2743a8c1c0 1 bdev(0x55b31af4a000 /var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 GiB) block_size 4096 (4 KiB) rotational
-465> 2019-03-05 08:36:02.236 7f2743a8c1c0 1 bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 1073741824 meta 0.4 kv 0.4 data 0.2
-464> 2019-03-05 08:36:02.237 7f2743a8c1c0 1 bdev create path /var/lib/ceph/osd/ceph-12/block type kernel
-463> 2019-03-05 08:36:02.237 7f2743a8c1c0 1 bdev(0x55b31af4aa80 /var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
-462> 2019-03-05 08:36:02.238 7f2743a8c1c0 1 bdev(0x55b31af4aa80 /var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 GiB) block_size 4096 (4 KiB) rotational
-461> 2019-03-05 08:36:02.238 7f2743a8c1c0 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-12/block size 137 GiB
-460> 2019-03-05 08:36:02.238 7f2743a8c1c0 1 bluefs mount
-459> 2019-03-05 08:36:02.339 7f2743a8c1c0 0 set rocksdb option compaction_readahead_size = 2097152
-458> 2019-03-05 08:36:02.339 7f2743a8c1c0 0 set rocksdb option compression = kNoCompression
-457> 2019-03-05 08:36:02.339 7f2743a8c1c0 0 set rocksdb option max_write_buffer_number = 4
-456> 2019-03-05 08:36:02.339 7f2743a8c1c0 0 set rocksdb option min_write_buffer_number_to_merge = 1
-455> 2019-03-05 08:36:02.339 7f2743a8c1c0 0 set rocksdb option recycle_log_file_num = 4
-454> 2019-03-05 08:36:02.339 7f2743a8c1c0 0 set rocksdb option writable_file_max_buffer_size = 0
-453> 2019-03-05 08:36:02.339 7f2743a8c1c0 0 set rocksdb option write_buffer_size = 268435456
-452> 2019-03-05 08:36:02.340 7f2743a8c1c0 0 set rocksdb option compaction_readahead_size = 2097152
-451> 2019-03-05 08:36:02.340 7f2743a8c1c0 0 set rocksdb option compression = kNoCompression
-450> 2019-03-05 08:36:02.340 7f2743a8c1c0 0 set rocksdb option max_write_buffer_number = 4
-449> 2019-03-05 08:36:02.340 7f2743a8c1c0 0 set rocksdb option min_write_buffer_number_to_merge = 1
-448> 2019-03-05 08:36:02.340 7f2743a8c1c0 0 set rocksdb option recycle_log_file_num = 4
-447> 2019-03-05 08:36:02.340 7f2743a8c1c0 0 set rocksdb option writable_file_max_buffer_size = 0
-446> 2019-03-05 08:36:02.340 7f2743a8c1c0 0 set rocksdb option write_buffer_size = 268435456
-445> 2019-03-05 08:36:02.340 7f2743a8c1c0 1 rocksdb: do_open column families: [default]
-444> 2019-03-05 08:36:02.341 7f2743a8c1c0 4 rocksdb: RocksDB version: 5.13.0
-443> 2019-03-05 08:36:02.342 7f2743a8c1c0 4 rocksdb: Git sha rocksdb_build_git_sha:@0@
-442> 2019-03-05 08:36:02.342 7f2743a8c1c0 4 rocksdb: Compile date Jan 4 2019
...
-271> 2019-03-05 08:36:02.431 7f2743a8c1c0 1 freelist init
-270> 2019-03-05 08:36:02.535 7f2743a8c1c0 1 bluestore(/var/lib/ceph/osd/ceph-12) _open_alloc opening allocation metadata
-269> 2019-03-05 08:36:02.714 7f2743a8c1c0 1 bluestore(/var/lib/ceph/osd/ceph-12) _open_alloc loaded 93 GiB in 31828 extents
-268> 2019-03-05 08:36:02.722 7f2743a8c1c0 2 osd.12 0 journal looks like hdd
-267> 2019-03-05 08:36:02.722 7f2743a8c1c0 2 osd.12 0 boot
-266> 2019-03-05 08:36:02.723 7f272a0f3700 5 bluestore.MempoolThread(0x55b31af46a30) _tune_cache_size target: 1073741824 heap: 64675840 unmapped: 786432 mapped: 63889408 old cache_size: 134217728 new cache size: 17349132402135320576
-265> 2019-03-05 08:36:02.723 7f272a0f3700 5 bluestore.MempoolThread(0x55b31af46a30) _trim_shards cache_size: 17349132402135320576 kv_alloc: 134217728 kv_used: 5099462 meta_alloc: 0 meta_used: 21301 data_alloc: 0 data_used: 0
...
2019-03-05 08:36:40.166 7f03fc57f700 1 osd.12 pg_epoch: 7063 pg[2.93( v 6687'5 (0'0,6687'5] local-lis/les=7015/7016 n=1 ec=103/103 lis/c 7015/7015 les/c/f 7016/7016/0 7063/7063/7063) [12,19] r=0 lpr=7063 pi=[7015,7063)/1 crt=6687'5 lcod 0'0 mlcod 0'0 unknown NOTIFY mbc={}] start_peering_interval up [19] -> [12,19], acting [19] -> [12,19], acting_primary 19 -> 12, up_primary 19 -> 12, role -1 -> 0, features acting 4611087854031142907 upacting 4611087854031142907
2019-03-05 08:36:40.167 7f03fc57f700 1 osd.12 pg_epoch: 7063 pg[2.93( v 6687'5 (0'0,6687'5] local-lis/les=7015/7016 n=1 ec=103/103 lis/c 7015/7015 les/c/f 7016/7016/0 7063/7063/7063) [12,19] r=0 lpr=7063 pi=[7015,7063)/1 crt=6687'5 lcod 0'0 mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
2019-03-05 08:36:40.167 7f03fb57d700 1 osd.12 pg_epoch: 7061 pg[2.40( v 6964'703 (0'0,6964'703] local-lis/les=6999/7000 n=1 ec=103/103 lis/c 6999/6999 les/c/f 7000/7000/0 7061/7061/6999) [8] r=-1 lpr=7061 pi=[6999,7061)/1 crt=6964'703 lcod 0'0 unknown mbc={}] start_peering_interval up [8,12] -> [8], acting [8,12] -> [8], acting_primary 8 -> 8, up_primary 8 -> 8, role 1 -> -1, features acting 4611087854031142907 upacting 4611087854031142907
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.12.log
--- end dump of recent events ---
2019-03-05 08:36:07.750 7f272a0f3700 -1 *** Caught signal (Aborted) **
in thread 7f272a0f3700 thread_name:bstore_mempool
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: (()+0x911e70) [0x55b318337e70]
2: (()+0xf5d0) [0x7f2737a4e5d0]
3: (gsignal()+0x37) [0x7f2736a6f207]
4: (abort()+0x148) [0x7f2736a708f8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x242) [0x7f273aec62b2]
6: (()+0x25a337) [0x7f273aec6337]
7: (()+0x7a886e) [0x55b3181ce86e]
8: (BlueStore::MempoolThread::entry()+0x3b0) [0x55b3181d0060]
9: (()+0x7dd5) [0x7f2737a46dd5]
10: (clone()+0x6d) [0x7f2736b36ead]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
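That 'new cache size: 17349132402135320576' from the bstore_mempool thread shortly before the abort looks bogus to me: it works out to roughly 15 EiB, larger than 2^63, which smells like unsigned arithmetic wrapping around rather than a real cache target (just a guess from the number itself, I have not read the BlueStore code). Quick sanity check:

python3 -c 'v = 17349132402135320576; print(v / 2**60, "EiB; exceeds 2**63:", v > 2**63)'   # prints ~15.05 EiB; exceeds 2**63: True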
Even without the 'osd memory target' conf key, the OSD still claims at startup:
bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 1073741824
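That 1073741824 presumably is just the old 1 GB bluestore_cache_size default mentioned in the release note above, so probably harmless on its own. Dumping the effective values over the admin socket might still be worth comparing across OSDs (assuming the osd_memory_* knobs exist under these names in 13.2.4):

ceph daemon osd.12 config show | egrep 'osd_memory_|bluestore_cache_size'   # effective memory/cache settings of the running OSD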
Any hints appreciated!
/Steffen
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Mon, Mar 4, 2019 at 3:55 PM Steffen Winther Sørensen
<stefws@xxxxxxxxx> wrote:
List Members,
I patched a CentOS 7 based cluster from 13.2.2 to 13.2.4 last Monday, and everything appeared to be working fine.
Only this morning I found all OSDs in the cluster bloated in memory footprint, possibly after the weekend backup through MDS.
Is anyone else seeing a possible memory leak in 13.2.4 OSDs, perhaps primarily when using MDS?
TIA
/Steffen
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com