----- Message from Gregory Farnum <greg at inktank.com> ---------
   Date: Wed, 21 May 2014 15:46:17 -0700
   From: Gregory Farnum <greg at inktank.com>
Subject: Re: Expanding pg's of an erasure coded pool
     To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
     Cc: ceph-users <ceph-users at lists.ceph.com>

> On Wed, May 21, 2014 at 3:52 AM, Kenneth Waegeman
> <Kenneth.Waegeman at ugent.be> wrote:
>> Thanks! I increased the max processes parameter for all daemons quite a lot
>> (up to ulimit -u 3802720).
>>
>> These are the limits for the daemons now:
>>
>> [root@ ~]# cat /proc/17006/limits
>> Limit                     Soft Limit           Hard Limit           Units
>> Max cpu time              unlimited            unlimited            seconds
>> Max file size             unlimited            unlimited            bytes
>> Max data size             unlimited            unlimited            bytes
>> Max stack size            10485760             unlimited            bytes
>> Max core file size        unlimited            unlimited            bytes
>> Max resident set          unlimited            unlimited            bytes
>> Max processes             3802720              3802720              processes
>> Max open files            32768                32768                files
>> Max locked memory         65536                65536                bytes
>> Max address space         unlimited            unlimited            bytes
>> Max file locks            unlimited            unlimited            locks
>> Max pending signals       95068                95068                signals
>> Max msgqueue size         819200               819200               bytes
>> Max nice priority         0                    0
>> Max realtime priority     0                    0
>> Max realtime timeout      unlimited            unlimited            us
>>
>> But this didn't help. Are there other parameters I should change?
>
> Hrm, is it exactly the same stack trace? You might need to bump the
> open files limit as well, although I'd be surprised. :/

I increased the open file limit to 128000 as a test, but I still get the same results.
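For anyone reproducing this: the per-process limits in the table above can be inspected and adjusted from inside a process with the Python stdlib `resource` module. A minimal sketch (it queries the current process rather than PID 17006, and the raise-to-hard-ceiling step is purely illustrative):

```python
# Minimal sketch: read the two limits discussed above ("Max processes" and
# "Max open files") for the current process, then raise the soft open-files
# limit to its hard ceiling. Values are whatever this process inherited,
# not the cluster's actual settings.
import resource

soft_nproc, hard_nproc = resource.getrlimit(resource.RLIMIT_NPROC)     # "Max processes"
soft_nofile, hard_nofile = resource.getrlimit(resource.RLIMIT_NOFILE)  # "Max open files"
print("max processes:", soft_nproc, hard_nproc)
print("max open files:", soft_nofile, hard_nofile)

# Raising the soft limit up to the hard limit needs no special privileges:
resource.setrlimit(resource.RLIMIT_NOFILE, (hard_nofile, hard_nofile))
```

For the daemons themselves the equivalent change has to happen before the process starts (init script, limits.conf), since rlimits are inherited at fork time.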
Stack trace:

 -16> 2014-05-22 11:10:05.262456 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] exit Started/ReplicaActive/RepNotRecovering 52.314752 4 0.000408
 -15> 2014-05-22 11:10:05.262649 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] exit Started/ReplicaActive 52.315020 0 0.000000
 -14> 2014-05-22 11:10:05.262667 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] exit Started 55.181842 0 0.000000
 -13> 2014-05-22 11:10:05.262681 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] enter Reset
 -12> 2014-05-22 11:10:05.262797 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] exit Reset 0.000117 1 0.000338
 -11> 2014-05-22 11:10:05.262956 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] enter Started
 -10> 2014-05-22 11:10:05.262983 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] enter Start
  -9> 2014-05-22 11:10:05.262994 7f3bfcaee700 1 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
  -8> 2014-05-22 11:10:05.263151 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] exit Start 0.000169 0 0.000000
  -7> 2014-05-22 11:10:05.263385 7f3bfcaee700 5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] enter Started/Stray
  -6> 2014-05-22 11:10:05.264331 7f3bfcaee700 1 -- 130.246.178.141:6928/31869 --> 130.246.179.192:6910/26664 -- pg_notify(16.8f5s14(2) epoch 6327) v5 -- ?+0 0x6a396c0 con 0x5db77a0
  -5> 2014-05-22 11:10:05.264551 7f3bfcaee700 1 -- 130.246.178.141:6928/31869 --> 130.246.179.192:6829/30080 -- pg_notify(16.b51s1(2) epoch 6327) v5 -- ?+0 0x6964280 con 0x43f11e0
  -4> 2014-05-22 11:10:05.264894 7f3bfcaee700 1 -- 130.246.178.141:6928/31869 --> 130.246.179.122:6819/31185 -- pg_notify(16.f2es11(2) epoch 6327) v5 -- ?+0 0x790cd00 con 0x50ed280
  -3> 2014-05-22 11:10:05.313524 7f3bfcaee700 1 -- 130.246.178.141:6928/31869 --> 130.246.179.122:6819/31185 -- pg_notify(16.e4es3(2) epoch 6327) v5 -- ?+0 0x682b9c0 con 0x50ed280
  -2> 2014-05-22 11:10:05.314115 7f3bfcaee700 1 -- 130.246.178.141:6928/31869 --> 130.246.179.182:6864/15414 -- pg_notify(16.c07s10(2) epoch 6327) v5 -- ?+0 0x790de80 con 0x537a940
  -1> 2014-05-22 11:10:05.314420 7f3bfcaee700 1 -- 130.246.178.141:6928/31869 --> 130.246.179.192:6910/26664 -- pg_notify(16.8f5s14(2) epoch 6327) v5 -- ?+0 0x71ee740 con 0x5db77a0
   0> 2014-05-22 11:10:05.322346 7f3c010f5700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3c010f5700 time 2014-05-22 11:10:05.320827
common/Thread.cc: 110: FAILED assert(ret == 0)

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xa2a6aa]
 3: (Accepter::entry()+0x265) [0xb3ca45]
 4: (()+0x79d1) [0x7f3c16e559d1]
 5: (clone()+0x6d) [0x7f3c15b90b6d]

2014-05-22 11:10:05.335487 7f3bffcf3700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3bffcf3700 time 2014-05-22 11:10:05.334302
common/Thread.cc: 110: FAILED assert(ret == 0)

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xa2a6aa]
 3: (Accepter::entry()+0x265) [0xb3ca45]
 4: (()+0x79d1) [0x7f3c16e559d1]
 5: (clone()+0x6d) [0x7f3c15b90b6d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2014-05-22 11:10:05.718101 7f3b99324700 0 -- 130.246.178.141:6928/31869 >> 130.246.178.141:6817/15419 pipe(0x673eb80 sd=896 :6928 s=0 pgs=0 cs=0 l=0 c=0x68e5c20).accept connect_seq 0 vs existing 0 state connecting
2014-05-22 11:10:05.720679 7f3b99324700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3b99324700 time 2014-05-22 11:10:05.719300
common/Thread.cc: 110: FAILED assert(ret == 0)

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
 2: (Pipe::accept()+0x493a) [0xb305fa]
 3: (Pipe::reader()+0x1b9e) [0xb3350e]
 4: (Pipe::Reader::entry()+0xd) [0xb359ed]
 5: (()+0x79d1) [0x7f3c16e559d1]
 6: (clone()+0x6d) [0x7f3c15b90b6d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
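The recurring "common/Thread.cc: 110: FAILED assert(ret == 0)" is Ceph aborting because pthread_create() returned nonzero, typically EAGAIN once a process/thread limit is exhausted. As a rough illustration (not Ceph code), the same kernel refusal surfaces in Python as a RuntimeError from Thread.start():

```python
# Sketch of the failure mode behind the assert: pthread_create() refusing
# a new thread. Python's threading wraps pthread_create, so the refusal
# shows up as RuntimeError ("can't start new thread") instead of an abort.
import threading

def try_spawn():
    """Return None if a thread can be started, else the error message."""
    t = threading.Thread(target=lambda: None)
    try:
        t.start()
    except RuntimeError as exc:   # raised when the underlying pthread_create fails
        return str(exc)
    t.join()
    return None

print(try_spawn())  # None while limits still allow new threads
```

Ceph, by contrast, asserts on the nonzero return and dumps the backtraces seen above rather than handling the error.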
2014-05-22 11:10:06.033702 7f3b99223700 0 -- 130.246.178.141:6928/31869 >> 130.246.179.168:6906/13921 pipe(0x673a800 sd=897 :6928 s=0 pgs=0 cs=0 l=0 c=0x68e7d20).accept connect_seq 0 vs existing 0 state wait

--- begin dump of recent events ---
 -12> 2014-05-22 11:10:05.322880 7f3c100ba700 5 osd.398 6327 tick
 -11> 2014-05-22 11:10:05.325790 7f3c100ba700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3c100ba700 time 2014-05-22 11:10:05.323100
common/Thread.cc: 110: FAILED assert(ret == 0)

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
 2: (SimpleMessenger::connect_rank(entity_addr_t const&, int, Connection*, Message*)+0x17b) [0xa2c1bb]
 3: (SimpleMessenger::get_connection(entity_inst_t const&)+0x180) [0xa30860]
 4: (OSDService::get_con_osd_hb(int, unsigned int)+0x189) [0x5fe3d9]
 5: (OSD::_add_heartbeat_peer(int)+0xa9) [0x5fe6b9]
 6: (OSD::maybe_update_heartbeat_peers()+0x7be) [0x60b59e]
 7: (OSD::tick()+0x217) [0x64fe17]
 8: (Context::complete(int)+0x9) [0x660579]
 9: (SafeTimer::timer_thread()+0x453) [0xab20a3]
 10: (SafeTimerThread::entry()+0xd) [0xab425d]
 11: (()+0x79d1) [0x7f3c16e559d1]
 12: (clone()+0x6d) [0x7f3c15b90b6d]

Monitor:

 -14> 2014-05-22 11:19:33.849633 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62148 ==== forward(mon_command({"prefix": "osd crush create-or-move", "args": ["host=gdss523", "root=default"], "id": 401, "weight": 1.8200000000000001} v 0) v1 caps allow profile osd tid 60304 con_features 8796093022207) to leader v2 ==== 429+0+0 (3816634502 0 0) 0x26c77580 con 0x2361760
 -13> 2014-05-22 11:19:33.849726 7fa7d9a8e700 0 mon.gdss514@0(leader) e2 handle_command mon_command({"prefix": "osd crush create-or-move", "args": ["host=gdss523", "root=default"], "id": 401, "weight": 1.8200000000000001} v 0) v1
 -12> 2014-05-22 11:19:33.849815 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62149 ==== forward(osd_alive(want up_thru 6383 have 6401) v1 caps allow profile osd tid 60305 con_features 8796093022207) to leader v2 ==== 288+0+0 (3476969871 0 0) 0x2a9cb480 con 0x2361760
 -11> 2014-05-22 11:19:33.849869 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62150 ==== forward(osd_alive(want up_thru 6402 have 6402) v1 caps allow profile osd tid 60306 con_features 8796093022207) to leader v2 ==== 288+0+0 (4190502222 0 0) 0x36f2f80 con 0x2361760
 -10> 2014-05-22 11:19:33.849892 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62151 ==== forward(osd_pgtemp(e6402 {2.24=[]} v6402) v1 caps allow profile osd tid 60307 con_features 8796093022207) to leader v2 ==== 313+0+0 (154505712 0 0) 0x2500f00 con 0x2361760
  -9> 2014-05-22 11:19:33.849912 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62152 ==== forward(osd_pgtemp(e6402 {2.24=[]} v6402) v1 caps allow profile osd tid 60308 con_features 8796093022207) to leader v2 ==== 313+0+0 (2579287395 0 0) 0x29abaa80 con 0x2361760
  -8> 2014-05-22 11:19:33.849954 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62153 ==== forward(osd_alive(want up_thru 6402 have 6402) v1 caps allow profile osd tid 60309 con_features 8796093022207) to leader v2 ==== 288+0+0 (28071569 0 0) 0x2b1d5f00 con 0x2361760
  -7> 2014-05-22 11:19:34.063692 7fa7da48f700 5 mon.gdss514@0(leader).paxos(paxos updating c 42420..43081) queue_proposal bl 42272 bytes; ctx = 0x29dc0470
  -6> 2014-05-22 11:19:34.063725 7fa7da48f700 5 mon.gdss514@0(leader).paxos(paxos updating c 42420..43081) propose_new_value not active; proposal queued
  -5> 2014-05-22 11:19:34.422715 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 <== osd.76 130.246.179.181:6880/10080 10 ==== mon_subscribe({monmap=3+,osd_pg_creates=0,osdmap=6399}) v2 ==== 69+0+0 (1136537960 0 0) 0x217508c0 con 0x2ac740a0
  -4> 2014-05-22 11:19:34.422782 7fa7d9a8e700 5 mon.gdss514@0(leader).osd e6402 send_incremental [6399..6402] to osd.76 130.246.179.181:6880/10080
  -3> 2014-05-22 11:19:34.423020 7fa7d9a8e700 1 -- 130.246.179.122:6789/0 --> osd.76 130.246.179.181:6880/10080 -- osd_map(6399..6402 src has 5747..6402) v3 -- ?+0 0x27375c40
  -2> 2014-05-22 11:19:34.649756 7fa7c8848700 2 -- 130.246.179.122:6789/0 >> 130.246.179.88:6880/17558 pipe(0x2b5d2080 sd=32 :6789 s=2 pgs=1 cs=1 l=1 c=0x2ad40dc0).reader couldn't read tag, (0) Success
  -1> 2014-05-22 11:19:34.649822 7fa7c8848700 2 -- 130.246.179.122:6789/0 >> 130.246.179.88:6880/17558 pipe(0x2b5d2080 sd=32 :6789 s=2 pgs=1 cs=1 l=1 c=0x2ad40dc0).fault (0) Success
   0> 2014-05-22 11:19:35.186091 7fa7d908d700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fa7d908d700 time 2014-05-22 11:19:33.914301
common/Thread.cc: 110: FAILED assert(ret == 0)

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (Thread::create(unsigned long)+0x8a) [0x748c9a]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0x8351ba]
 3: (Accepter::entry()+0x265) [0x863295]
 4: (()+0x79d1) [0x7fa7dfdf89d1]
 5: (clone()+0x6d) [0x7fa7deb33b6d]

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: /usr/bin/ceph-mon() [0x86b991]
 2: (()+0xf710) [0x7fa7dfe00710]
 3: (gsignal()+0x35) [0x7fa7dea7d925]
 4: (abort()+0x175) [0x7fa7dea7f105]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fa7df337a5d]
 6: (()+0xbcbe6) [0x7fa7df335be6]
 7: (()+0xbcc13) [0x7fa7df335c13]
 8: (()+0xbcd0e) [0x7fa7df335d0e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0x7a5472]
 10: (Thread::create(unsigned long)+0x8a) [0x748c9a]
 11: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0x8351ba]
 12: (Accepter::entry()+0x265) [0x863295]
 13: (()+0x79d1) [0x7fa7dfdf89d1]
 14: (clone()+0x6d) [0x7fa7deb33b6d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
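The monitor dies on the same Thread::create assert. Besides the per-process ulimits discussed above, thread and fork creation can also be refused by system-wide kernel limits even when free memory remains, which would be consistent with the "fork: Cannot allocate memory" seen below despite memory never running out. A small Linux-specific sketch of the knobs worth checking (standard /proc paths; whether these are the actual culprit here is an assumption, not something the thread confirms):

```python
# Sketch: kernel-wide limits that can make pthread_create()/fork() fail
# with EAGAIN/ENOMEM even with free RAM (Linux-specific /proc/sys paths).
def read_int(path):
    with open(path) as f:
        return int(f.read().split()[0])

threads_max = read_int("/proc/sys/kernel/threads-max")  # total threads system-wide
pid_max     = read_int("/proc/sys/kernel/pid_max")      # total task IDs
map_count   = read_int("/proc/sys/vm/max_map_count")    # mappings per process (thread stacks count)

print("kernel.threads-max:", threads_max)
print("kernel.pid_max:", pid_max)
print("vm.max_map_count:", map_count)
```

Each SimpleMessenger pipe spawns threads, so a pgp_num bump that opens many peer connections at once multiplies thread usage quickly.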
--- begin dump of recent events ---
  -4> 2014-05-22 11:19:35.237629 7fa7d2da1700 2 -- 130.246.179.122:6789/0 >> 130.246.179.184:6880/30668 pipe(0x2b5d0500 sd=25 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b2aa0).reader couldn't read tag, (0) Success
  -3> 2014-05-22 11:19:35.237684 7fa7d2da1700 2 -- 130.246.179.122:6789/0 >> 130.246.179.184:6880/30668 pipe(0x2b5d0500 sd=25 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b2aa0).fault (0) Success
  -2> 2014-05-22 11:19:35.349080 7fa7d2341700 2 -- 130.246.179.122:6789/0 >> 130.246.179.187:6895/9386 pipe(0xd256400 sd=37 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b7a60).reader couldn't read tag, (0) Success
  -1> 2014-05-22 11:19:35.349144 7fa7d2341700 2 -- 130.246.179.122:6789/0 >> 130.246.179.187:6895/9386 pipe(0xd256400 sd=37 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b7a60).fault (0) Success
   0> 2014-05-22 11:19:35.441249 7fa7d908d700 -1 *** Caught signal (Aborted) **
 in thread 7fa7d908d700

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: /usr/bin/ceph-mon() [0x86b991]
 2: (()+0xf710) [0x7fa7dfe00710]
 3: (gsignal()+0x35) [0x7fa7dea7d925]
 4: (abort()+0x175) [0x7fa7dea7f105]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fa7df337a5d]
 6: (()+0xbcbe6) [0x7fa7df335be6]
 7: (()+0xbcc13) [0x7fa7df335c13]
 8: (()+0xbcd0e) [0x7fa7df335d0e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0x7a5472]
 10: (Thread::create(unsigned long)+0x8a) [0x748c9a]
 11: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0x8351ba]
 12: (Accepter::entry()+0x265) [0x863295]
 13: (()+0x79d1) [0x7fa7dfdf89d1]
 14: (clone()+0x6d) [0x7fa7deb33b6d]

I also see some things happening on the system while this is going on:

[root@ ~]# ceph osd pool set ecdata15 pgp_num 4096
set pool 16 pgp_num to 4096
[root@ ~]# ceph status
Traceback (most recent call last):
  File "/usr/bin/ceph", line 830, in <module>
    sys.exit(main())
  File "/usr/bin/ceph", line 590, in main
    conffile=conffile)
  File "/usr/lib/python2.6/site-packages/rados.py", line 198, in __init__
    librados_path = find_library('rados')
  File "/usr/lib64/python2.6/ctypes/util.py", line 209, in find_library
    return _findSoname_ldconfig(name) or _get_soname(_findLib_gcc(name))
  File "/usr/lib64/python2.6/ctypes/util.py", line 203, in _findSoname_ldconfig
    os.popen('LANG=C /sbin/ldconfig -p 2>/dev/null').read())
OSError: [Errno 12] Cannot allocate memory
[root@ ~]# lsof | wc
-bash: fork: Cannot allocate memory
[root@ ~]# lsof | wc
  21801  211209 3230028
[root@ ~]# ceph status
^CError connecting to cluster: InterruptedOrTimeoutError
[root@ ~]# lsof | wc
   2028   17476  190947

Meanwhile the daemons have crashed. I verified that the machine never actually ran out of memory.

Thanks!
Kenneth

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com

----- End message from Gregory Farnum <greg at inktank.com> -----

--
Met vriendelijke groeten,
Kenneth Waegeman