I'm having trouble changing osd_memory_target on my test cluster. I've
upgraded the whole cluster from Luminous to Nautilus, and all OSDs are
running BlueStore. Because this test lab is short on RAM, I wanted to
lower osd_memory_target to save some memory.
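(2147483648 bytes is 2 GiB, i.e. half of the 4 GiB default, so this should
save roughly 2 GiB of RAM per OSD. For reference, this is roughly how I'd
check the declared default and what a running OSD is actually using -
osd.0 is just an example ID:

# ceph config help osd_memory_target
# ceph daemon osd.0 config get osd_memory_target
)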
# ceph version
ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
# ceph config set osd osd_memory_target 2147483648
# ceph config dump
WHO  MASK  LEVEL     OPTION                     VALUE         RO
mon        advanced  auth_client_required       cephx         *
mon        advanced  auth_cluster_required      cephx         *
mon        advanced  auth_service_required      cephx         *
mon        advanced  mon_allow_pool_delete      true
mon        advanced  mon_max_pg_per_osd         500
mgr        advanced  mgr/balancer/active        true
mgr        advanced  mgr/balancer/mode          crush-compat
osd        advanced  osd_crush_update_on_start  true
osd        advanced  osd_max_backfills          4
osd        basic     osd_memory_target          2147483648
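(For what it's worth, the value can also be read back with e.g.
`ceph config get osd osd_memory_target`; the dump above shows the
monitors have accepted it.)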
After setting this, none of the OSDs will start or restart:
# /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph
Log from /var/log/ceph/ceph-osd.0.log:
min_mon_release 14 (nautilus)
0: [v2:10.0.92.69:3300/0,v1:10.0.92.69:6789/0]
mon.testlab-ceph-03
1: [v2:10.0.92.72:3300/0,v1:10.0.92.72:6789/0]
mon.testlab-ceph-04
2: [v2:10.0.92.67:3300/0,v1:10.0.92.67:6789/0]
mon.testlab-ceph-01
3: [v2:10.0.92.68:3300/0,v1:10.0.92.68:6789/0]
mon.testlab-ceph-02
-54> 2020-01-21 11:45:19.289 7f6aa5d78700 1
monclient: mon.2 has (v2) addrs
[v2:10.0.92.67:3300/0,v1:10.0.92.67:6789/0] but i'm connected
to v1:10.0.92.67:6789/0, reconnecting
-53> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient: _reopen_session rank -1
-52> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient(hunting): picked mon.testlab-ceph-01 con
0x563319682880 addr
[v2:10.0.92.67:3300/0,v1:10.0.92.67:6789/0]
-51> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient(hunting): picked mon.testlab-ceph-04 con
0x563319682d00 addr
[v2:10.0.92.72:3300/0,v1:10.0.92.72:6789/0]
-50> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient(hunting): picked mon.testlab-ceph-02 con
0x563319683180 addr
[v2:10.0.92.68:3300/0,v1:10.0.92.68:6789/0]
-49> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient(hunting): start opening mon connection
-48> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient(hunting): start opening mon connection
-47> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient(hunting): start opening mon connection
-46> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient(hunting): _renew_subs
-45> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient(hunting): get_auth_request con 0x563319682880
auth_method 0
-44> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient(hunting): get_auth_request method 2 preferred_modes
[1,2]
-43> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient(hunting): _init_auth method 2
-42> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient(hunting): handle_auth_reply_more payload 9
-41> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient(hunting): handle_auth_reply_more payload_len 9
-40> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient(hunting): handle_auth_reply_more responding with 36
bytes
-39> 2020-01-21 11:45:19.289 7f6aa6579700 10
monclient(hunting): get_auth_request con 0x563319682d00
auth_method 0
-38> 2020-01-21 11:45:19.289 7f6aa6579700 10
monclient(hunting): get_auth_request method 2 preferred_modes
[1,2]
-37> 2020-01-21 11:45:19.289 7f6aa6579700 10
monclient(hunting): _init_auth method 2
-36> 2020-01-21 11:45:19.289 7f6aa757b700 10
monclient(hunting): get_auth_request con 0x563319683180
auth_method 0
-35> 2020-01-21 11:45:19.289 7f6aa757b700 10
monclient(hunting): get_auth_request method 2 preferred_modes
[1,2]
-34> 2020-01-21 11:45:19.289 7f6aa757b700 10
monclient(hunting): _init_auth method 2
-33> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient(hunting): handle_auth_done global_id 5638238 payload
386
-32> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient: _finish_hunting 0
-31> 2020-01-21 11:45:19.289 7f6aa6d7a700 1
monclient: found mon.testlab-ceph-01
-30> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient: _send_mon_message to mon.testlab-ceph-01 at
v2:10.0.92.67:3300/0
-29> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient: _finish_auth 0
-28> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient: _check_auth_rotating renewing rotating keys (they
expired before 2020-01-21 11:44:49.293059)
-27> 2020-01-21 11:45:19.289 7f6aa6d7a700 10
monclient: _send_mon_message to mon.testlab-ceph-01 at
v2:10.0.92.67:3300/0
-26> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient: handle_monmap mon_map magic: 0 v1
-25> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient: got monmap 17 from mon.testlab-ceph-01 (according
to old e17)
-24> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient: dump:
epoch 17
fsid f42082cc-c35a-44fe-b7ef-c2eb2ff1fe43
last_changed 2020-01-20 10:35:23.579081
created 2018-04-25 17:07:31.881451
min_mon_release 14 (nautilus)
0: [v2:10.0.92.69:3300/0,v1:10.0.92.69:6789/0]
mon.testlab-ceph-03
1: [v2:10.0.92.72:3300/0,v1:10.0.92.72:6789/0]
mon.testlab-ceph-04
2: [v2:10.0.92.67:3300/0,v1:10.0.92.67:6789/0]
mon.testlab-ceph-01
3: [v2:10.0.92.68:3300/0,v1:10.0.92.68:6789/0]
mon.testlab-ceph-02
-23> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient: handle_config config(3 keys) v1
-22> 2020-01-21 11:45:19.289 7f6aa7781c80 10
monclient: get_monmap_and_config success
-21> 2020-01-21 11:45:19.289 7f6aa7781c80 10
monclient: shutdown
-20> 2020-01-21 11:45:19.289 7f6aa4575700 4
set_mon_vals no callback set
-19> 2020-01-21 11:45:19.289 7f6aa5d78700 10
monclient: discarding stray monitor message mon_map magic: 0
v1
-18> 2020-01-21 11:45:19.289 7f6aa4575700 10
set_mon_vals osd_crush_update_on_start = true
-17> 2020-01-21 11:45:19.289 7f6aa4575700 10
set_mon_vals osd_max_backfills = 4
-16> 2020-01-21 11:45:19.289 7f6aa4575700 10
set_mon_vals osd_memory_target = 2147483648
-15> 2020-01-21 11:45:19.297 7f6aa7781c80 0 set
uid:gid to 64045:64045 (ceph:ceph)
-14> 2020-01-21 11:45:19.297 7f6aa7781c80 0 ceph
version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9)
nautilus (stable), process ceph-osd, pid 728019
-13> 2020-01-21 11:45:20.681 7f6aa7781c80 0
pidfile_write: ignore empty --pid-file
-12> 2020-01-21 11:45:20.685 7f6aa7781c80 5
asok(0x563319688000) init /var/run/ceph/ceph-osd.0.asok
-11> 2020-01-21 11:45:20.685 7f6aa7781c80 5
asok(0x563319688000) bind_and_listen
/var/run/ceph/ceph-osd.0.asok
-10> 2020-01-21 11:45:20.685 7f6aa7781c80 5
asok(0x563319688000) register_command 0 hook 0x5633196003f0
-9> 2020-01-21 11:45:20.685 7f6aa7781c80 5
asok(0x563319688000) register_command version hook
0x5633196003f0
-8> 2020-01-21 11:45:20.685 7f6aa7781c80 5
asok(0x563319688000) register_command git_version hook
0x5633196003f0
-7> 2020-01-21 11:45:20.685 7f6aa7781c80 5
asok(0x563319688000) register_command help hook 0x563319602220
-6> 2020-01-21 11:45:20.685 7f6aa7781c80 5
asok(0x563319688000) register_command get_command_descriptions
hook 0x563319602260
-5> 2020-01-21 11:45:20.685 7f6aa4d76700 5
asok(0x563319688000) entry start
-4> 2020-01-21 11:45:20.685 7f6aa7781c80 5
object store type is bluestore
-3> 2020-01-21 11:45:20.689 7f6aa7781c80 1 bdev
create path /var/lib/ceph/osd/ceph-0/block type kernel
-2> 2020-01-21 11:45:20.689 7f6aa7781c80 1
bdev(0x56331a2d8000 /var/lib/ceph/osd/ceph-0/block) open path
/var/lib/ceph/osd/ceph-0/block
-1> 2020-01-21 11:45:20.689 7f6aa7781c80 1
bdev(0x56331a2d8000 /var/lib/ceph/osd/ceph-0/block) open size
2000381018112 (0x1d1c0000000, 1.8 TiB) block_size 4096 (4 KiB)
rotational discard not supported
0> 2020-01-21 11:45:20.693 7f6aa7781c80 -1 *** Caught signal (Aborted) **
in thread 7f6aa7781c80 thread_name:ceph-osd

ceph version 14.2.6 (f0aa067ac7a02ee46ea48aa26c6e298b5ea272e9) nautilus (stable)
1: (()+0x12730) [0x7f6aa8229730]
2: (gsignal()+0x10b) [0x7f6aa7d0d7bb]
3: (abort()+0x121) [0x7f6aa7cf8535]
4: (()+0x8c983) [0x7f6aa80c0983]
5: (()+0x928c6) [0x7f6aa80c68c6]
6: (()+0x92901) [0x7f6aa80c6901]
7: (()+0x92b34) [0x7f6aa80c6b34]
8: (()+0x5a3f53) [0x56330f0a0f53]
9: (Option::size_t const md_config_t::get_val<Option::size_t>(ConfigValues const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const+0x81) [0x56330f0a6c91]
10: (BlueStore::_set_cache_sizes()+0x15a) [0x56330f521d8a]
11: (BlueStore::_open_bdev(bool)+0x173) [0x56330f524b23]
12: (BlueStore::get_devices(std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)+0xef) [0x56330f58b7ef]
13: (BlueStore::get_numa_node(int*, std::set<int, std::less<int>, std::allocator<int> >*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)+0x7b) [0x56330f53371b]
14: (main()+0x2870) [0x56330f06e440]
15: (__libc_start_main()+0xeb) [0x7f6aa7cfa09b]
16: (_start()+0x2a) [0x56330f0a0c6a]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.0.log
--- end dump of recent events ---
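If I read the backtrace right, the abort happens while
BlueStore::_set_cache_sizes() reads osd_memory_target via
md_config_t::get_val<Option::size_t>(), i.e. while fetching the option as
a "size" value. The declared type and default of the option can be
double-checked with:

# ceph config help osd_memory_target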
When I remove this option:
# ceph config rm osd osd_memory_target
the OSD starts without any trouble. I've seen the same behaviour when I
wrote this parameter into /etc/ceph/ceph.conf instead.
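(For the ceph.conf test I used a stanza along these lines under the [osd]
section, with the same 2 GiB value:

[osd]
osd_memory_target = 2147483648
)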
Is this a known bug? Am I doing something wrong?
Any help appreciated.
Best Regards,
Martin
-- Martin Mlynář