Expanding PGs of an erasure-coded pool

----- Message from Gregory Farnum <greg at inktank.com> ---------
    Date: Wed, 21 May 2014 15:46:17 -0700
    From: Gregory Farnum <greg at inktank.com>
Subject: Re: Expanding pg's of an erasure coded pool
      To: Kenneth Waegeman <Kenneth.Waegeman at ugent.be>
      Cc: ceph-users <ceph-users at lists.ceph.com>


> On Wed, May 21, 2014 at 3:52 AM, Kenneth Waegeman
> <Kenneth.Waegeman at ugent.be> wrote:
>> Thanks! I increased the max processes parameter for all daemons quite a
>> lot (up to ulimit -u 3802720).
>>
>> These are the limits for the daemons now:
>> [root@ ~]# cat /proc/17006/limits
>> Limit                     Soft Limit           Hard Limit           Units
>> Max cpu time              unlimited            unlimited            seconds
>> Max file size             unlimited            unlimited            bytes
>> Max data size             unlimited            unlimited            bytes
>> Max stack size            10485760             unlimited            bytes
>> Max core file size        unlimited            unlimited            bytes
>> Max resident set          unlimited            unlimited            bytes
>> Max processes             3802720              3802720              processes
>> Max open files            32768                32768                files
>> Max locked memory         65536                65536                bytes
>> Max address space         unlimited            unlimited            bytes
>> Max file locks            unlimited            unlimited            locks
>> Max pending signals       95068                95068                signals
>> Max msgqueue size         819200               819200               bytes
>> Max nice priority         0                    0
>> Max realtime priority     0                    0
>> Max realtime timeout      unlimited            unlimited            us
>>
>> But this didn't help. Are there other parameters I should change?
>
> Hrm, is it exactly the same stack trace? You might need to bump the
> open files limit as well, although I'd be surprised. :/

I increased the open file limit to 128000 as a test; the result is still the same.
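For reference, this is roughly how the limits of a running daemon can be inspected and raised before a restart (a sketch; the pid lookup and the init mechanism will differ per setup):

[root@ ~]# cat /proc/$(pidof -s ceph-osd)/limits | egrep 'processes|open files'
[root@ ~]# ulimit -u 3802720    # max user processes; on Linux this counts threads too
[root@ ~]# ulimit -n 128000     # max open file descriptors
[root@ ~]# /etc/init.d/ceph restart osd    # daemons inherit limits from the shell that starts them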

Stack trace:

    -16> 2014-05-22 11:10:05.262456 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] exit Started/ReplicaActive/RepNotRecovering 52.314752 4 0.000408
    -15> 2014-05-22 11:10:05.262649 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] exit Started/ReplicaActive 52.315020 0 0.000000
    -14> 2014-05-22 11:10:05.262667 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] exit Started 55.181842 0 0.000000
    -13> 2014-05-22 11:10:05.262681 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6293/6310/6293) [255,52,147,15,402,280,129,321,125,180,301,85,22,340,398] r=14 lpr=6310 pi=6293-6309/1 crt=0'0 active] enter Reset
    -12> 2014-05-22 11:10:05.262797 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] exit Reset 0.000117 1 0.000338
    -11> 2014-05-22 11:10:05.262956 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] enter Started
    -10> 2014-05-22 11:10:05.262983 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] enter Start
     -9> 2014-05-22 11:10:05.262994 7f3bfcaee700  1 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] state<Start>: transitioning to Stray
     -8> 2014-05-22 11:10:05.263151 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] exit Start 0.000169 0 0.000000
     -7> 2014-05-22 11:10:05.263385 7f3bfcaee700  5 osd.398 pg_epoch: 6327 pg[16.8f5s14( empty local-les=6326 n=0 ec=6293 les/c 6326/6326 6327/6327/6327) [200,176,57,135,107,426,234,409,264,280,338,381,317,220,79] r=-1 lpr=6327 pi=6293-6326/2 crt=0'0 inactive NOTIFY] enter Started/Stray
     -6> 2014-05-22 11:10:05.264331 7f3bfcaee700  1 -- 130.246.178.141:6928/31869 --> 130.246.179.192:6910/26664 -- pg_notify(16.8f5s14(2) epoch 6327) v5 -- ?+0 0x6a396c0 con 0x5db77a0
     -5> 2014-05-22 11:10:05.264551 7f3bfcaee700  1 -- 130.246.178.141:6928/31869 --> 130.246.179.192:6829/30080 -- pg_notify(16.b51s1(2) epoch 6327) v5 -- ?+0 0x6964280 con 0x43f11e0
     -4> 2014-05-22 11:10:05.264894 7f3bfcaee700  1 -- 130.246.178.141:6928/31869 --> 130.246.179.122:6819/31185 -- pg_notify(16.f2es11(2) epoch 6327) v5 -- ?+0 0x790cd00 con 0x50ed280
     -3> 2014-05-22 11:10:05.313524 7f3bfcaee700  1 -- 130.246.178.141:6928/31869 --> 130.246.179.122:6819/31185 -- pg_notify(16.e4es3(2) epoch 6327) v5 -- ?+0 0x682b9c0 con 0x50ed280
     -2> 2014-05-22 11:10:05.314115 7f3bfcaee700  1 -- 130.246.178.141:6928/31869 --> 130.246.179.182:6864/15414 -- pg_notify(16.c07s10(2) epoch 6327) v5 -- ?+0 0x790de80 con 0x537a940
     -1> 2014-05-22 11:10:05.314420 7f3bfcaee700  1 -- 130.246.178.141:6928/31869 --> 130.246.179.192:6910/26664 -- pg_notify(16.8f5s14(2) epoch 6327) v5 -- ?+0 0x71ee740 con 0x5db77a0
      0> 2014-05-22 11:10:05.322346 7f3c010f5700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3c010f5700 time 2014-05-22 11:10:05.320827
common/Thread.cc: 110: FAILED assert(ret == 0)

  ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
  1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xa2a6aa]
  3: (Accepter::entry()+0x265) [0xb3ca45]
  4: (()+0x79d1) [0x7f3c16e559d1]
  5: (clone()+0x6d) [0x7f3c15b90b6d]

2014-05-22 11:10:05.335487 7f3bffcf3700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3bffcf3700 time 2014-05-22 11:10:05.334302
common/Thread.cc: 110: FAILED assert(ret == 0)

  ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
  1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xa2a6aa]
  3: (Accepter::entry()+0x265) [0xb3ca45]
  4: (()+0x79d1) [0x7f3c16e559d1]
  5: (clone()+0x6d) [0x7f3c15b90b6d]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2014-05-22 11:10:05.718101 7f3b99324700  0 -- 130.246.178.141:6928/31869 >> 130.246.178.141:6817/15419 pipe(0x673eb80 sd=896 :6928 s=0 pgs=0 cs=0 l=0 c=0x68e5c20).accept connect_seq 0 vs existing 0 state connecting
2014-05-22 11:10:05.720679 7f3b99324700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3b99324700 time 2014-05-22 11:10:05.719300
common/Thread.cc: 110: FAILED assert(ret == 0)

  ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
  1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
  2: (Pipe::accept()+0x493a) [0xb305fa]
  3: (Pipe::reader()+0x1b9e) [0xb3350e]
  4: (Pipe::Reader::entry()+0xd) [0xb359ed]
  5: (()+0x79d1) [0x7f3c16e559d1]
  6: (clone()+0x6d) [0x7f3c15b90b6d]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2014-05-22 11:10:06.033702 7f3b99223700  0 -- 130.246.178.141:6928/31869 >> 130.246.179.168:6906/13921 pipe(0x673a800 sd=897 :6928 s=0 pgs=0 cs=0 l=0 c=0x68e7d20).accept connect_seq 0 vs existing 0 state wait
--- begin dump of recent events ---
    -12> 2014-05-22 11:10:05.322880 7f3c100ba700  5 osd.398 6327 tick
    -11> 2014-05-22 11:10:05.325790 7f3c100ba700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f3c100ba700 time 2014-05-22 11:10:05.323100
common/Thread.cc: 110: FAILED assert(ret == 0)

  ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
  1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
  2: (SimpleMessenger::connect_rank(entity_addr_t const&, int, Connection*, Message*)+0x17b) [0xa2c1bb]
  3: (SimpleMessenger::get_connection(entity_inst_t const&)+0x180) [0xa30860]
  4: (OSDService::get_con_osd_hb(int, unsigned int)+0x189) [0x5fe3d9]
  5: (OSD::_add_heartbeat_peer(int)+0xa9) [0x5fe6b9]
  6: (OSD::maybe_update_heartbeat_peers()+0x7be) [0x60b59e]
  7: (OSD::tick()+0x217) [0x64fe17]
  8: (Context::complete(int)+0x9) [0x660579]
  9: (SafeTimer::timer_thread()+0x453) [0xab20a3]
  10: (SafeTimerThread::entry()+0xd) [0xab425d]
  11: (()+0x79d1) [0x7f3c16e559d1]
  12: (clone()+0x6d) [0x7f3c15b90b6d]
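For context: common/Thread.cc:110 asserts on the return value of pthread_create(), so the daemon aborts the moment the messenger cannot spawn another Pipe thread, and each SimpleMessenger Pipe runs a couple of threads, which adds up fast during a PG split. A rough way to see how close the host is, including the kernel-wide ceilings that apply regardless of per-user ulimits (a sketch; adjust process names to your setup):

[root@ ~]# ls /proc/$(pidof -s ceph-osd)/task | wc -l   # threads of a single OSD
[root@ ~]# ps -eLf | grep '[c]eph' | wc -l              # threads of all ceph daemons on the host
[root@ ~]# sysctl kernel.pid_max kernel.threads-max     # kernel-wide task ceilings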


Monitor


    -14> 2014-05-22 11:19:33.849633 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62148 ==== forward(mon_command({"prefix": "osd crush create-or-move", "args": ["host=gdss523", "root=default"], "id": 401, "weight": 1.8200000000000001} v 0) v1 caps allow profile osd tid 60304 con_features 8796093022207) to leader v2 ==== 429+0+0 (3816634502 0 0) 0x26c77580 con 0x2361760
    -13> 2014-05-22 11:19:33.849726 7fa7d9a8e700  0 mon.gdss514@0(leader) e2 handle_command mon_command({"prefix": "osd crush create-or-move", "args": ["host=gdss523", "root=default"], "id": 401, "weight": 1.8200000000000001} v 0) v1
    -12> 2014-05-22 11:19:33.849815 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62149 ==== forward(osd_alive(want up_thru 6383 have 6401) v1 caps allow profile osd tid 60305 con_features 8796093022207) to leader v2 ==== 288+0+0 (3476969871 0 0) 0x2a9cb480 con 0x2361760
    -11> 2014-05-22 11:19:33.849869 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62150 ==== forward(osd_alive(want up_thru 6402 have 6402) v1 caps allow profile osd tid 60306 con_features 8796093022207) to leader v2 ==== 288+0+0 (4190502222 0 0) 0x36f2f80 con 0x2361760
    -10> 2014-05-22 11:19:33.849892 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62151 ==== forward(osd_pgtemp(e6402 {2.24=[]} v6402) v1 caps allow profile osd tid 60307 con_features 8796093022207) to leader v2 ==== 313+0+0 (154505712 0 0) 0x2500f00 con 0x2361760
     -9> 2014-05-22 11:19:33.849912 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62152 ==== forward(osd_pgtemp(e6402 {2.24=[]} v6402) v1 caps allow profile osd tid 60308 con_features 8796093022207) to leader v2 ==== 313+0+0 (2579287395 0 0) 0x29abaa80 con 0x2361760
     -8> 2014-05-22 11:19:33.849954 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 <== mon.2 130.246.179.191:6789/0 62153 ==== forward(osd_alive(want up_thru 6402 have 6402) v1 caps allow profile osd tid 60309 con_features 8796093022207) to leader v2 ==== 288+0+0 (28071569 0 0) 0x2b1d5f00 con 0x2361760
     -7> 2014-05-22 11:19:34.063692 7fa7da48f700  5 mon.gdss514@0(leader).paxos(paxos updating c 42420..43081) queue_proposal bl 42272 bytes; ctx = 0x29dc0470
     -6> 2014-05-22 11:19:34.063725 7fa7da48f700  5 mon.gdss514@0(leader).paxos(paxos updating c 42420..43081) propose_new_value not active; proposal queued
     -5> 2014-05-22 11:19:34.422715 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 <== osd.76 130.246.179.181:6880/10080 10 ==== mon_subscribe({monmap=3+,osd_pg_creates=0,osdmap=6399}) v2 ==== 69+0+0 (1136537960 0 0) 0x217508c0 con 0x2ac740a0
     -4> 2014-05-22 11:19:34.422782 7fa7d9a8e700  5 mon.gdss514@0(leader).osd e6402 send_incremental [6399..6402] to osd.76 130.246.179.181:6880/10080
     -3> 2014-05-22 11:19:34.423020 7fa7d9a8e700  1 -- 130.246.179.122:6789/0 --> osd.76 130.246.179.181:6880/10080 -- osd_map(6399..6402 src has 5747..6402) v3 -- ?+0 0x27375c40
     -2> 2014-05-22 11:19:34.649756 7fa7c8848700  2 -- 130.246.179.122:6789/0 >> 130.246.179.88:6880/17558 pipe(0x2b5d2080 sd=32 :6789 s=2 pgs=1 cs=1 l=1 c=0x2ad40dc0).reader couldn't read tag, (0) Success
     -1> 2014-05-22 11:19:34.649822 7fa7c8848700  2 -- 130.246.179.122:6789/0 >> 130.246.179.88:6880/17558 pipe(0x2b5d2080 sd=32 :6789 s=2 pgs=1 cs=1 l=1 c=0x2ad40dc0).fault (0) Success
      0> 2014-05-22 11:19:35.186091 7fa7d908d700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fa7d908d700 time 2014-05-22 11:19:33.914301
common/Thread.cc: 110: FAILED assert(ret == 0)

  ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
  1: (Thread::create(unsigned long)+0x8a) [0x748c9a]
  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0x8351ba]
  3: (Accepter::entry()+0x265) [0x863295]
  4: (()+0x79d1) [0x7fa7dfdf89d1]
  5: (clone()+0x6d) [0x7fa7deb33b6d]
  ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
  1: /usr/bin/ceph-mon() [0x86b991]
  2: (()+0xf710) [0x7fa7dfe00710]
  3: (gsignal()+0x35) [0x7fa7dea7d925]
  4: (abort()+0x175) [0x7fa7dea7f105]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fa7df337a5d]
  6: (()+0xbcbe6) [0x7fa7df335be6]
  7: (()+0xbcc13) [0x7fa7df335c13]
  8: (()+0xbcd0e) [0x7fa7df335d0e]
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0x7a5472]
  10: (Thread::create(unsigned long)+0x8a) [0x748c9a]
  11: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0x8351ba]
  12: (Accepter::entry()+0x265) [0x863295]
  13: (()+0x79d1) [0x7fa7dfdf89d1]
  14: (clone()+0x6d) [0x7fa7deb33b6d]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
     -4> 2014-05-22 11:19:35.237629 7fa7d2da1700  2 -- 130.246.179.122:6789/0 >> 130.246.179.184:6880/30668 pipe(0x2b5d0500 sd=25 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b2aa0).reader couldn't read tag, (0) Success
     -3> 2014-05-22 11:19:35.237684 7fa7d2da1700  2 -- 130.246.179.122:6789/0 >> 130.246.179.184:6880/30668 pipe(0x2b5d0500 sd=25 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b2aa0).fault (0) Success
     -2> 2014-05-22 11:19:35.349080 7fa7d2341700  2 -- 130.246.179.122:6789/0 >> 130.246.179.187:6895/9386 pipe(0xd256400 sd=37 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b7a60).reader couldn't read tag, (0) Success
     -1> 2014-05-22 11:19:35.349144 7fa7d2341700  2 -- 130.246.179.122:6789/0 >> 130.246.179.187:6895/9386 pipe(0xd256400 sd=37 :6789 s=2 pgs=1 cs=1 l=1 c=0xd5b7a60).fault (0) Success
      0> 2014-05-22 11:19:35.441249 7fa7d908d700 -1 *** Caught signal (Aborted) **
  in thread 7fa7d908d700


  ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
  1: /usr/bin/ceph-mon() [0x86b991]
  2: (()+0xf710) [0x7fa7dfe00710]
  3: (gsignal()+0x35) [0x7fa7dea7d925]
  4: (abort()+0x175) [0x7fa7dea7f105]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fa7df337a5d]
  6: (()+0xbcbe6) [0x7fa7df335be6]
  7: (()+0xbcc13) [0x7fa7df335c13]
  8: (()+0xbcd0e) [0x7fa7df335d0e]
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0x7a5472]
  10: (Thread::create(unsigned long)+0x8a) [0x748c9a]
  11: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0x8351ba]
  12: (Accepter::entry()+0x265) [0x863295]
  13: (()+0x79d1) [0x7fa7dfdf89d1]
  14: (clone()+0x6d) [0x7fa7deb33b6d]




I also see some odd things happening on the system while doing this:



[root@ ~]# ceph osd pool set ecdata15 pgp_num 4096
set pool 16 pgp_num to 4096
[root@ ~]# ceph status
Traceback (most recent call last):
   File "/usr/bin/ceph", line 830, in <module>
     sys.exit(main())
   File "/usr/bin/ceph", line 590, in main
     conffile=conffile)
   File "/usr/lib/python2.6/site-packages/rados.py", line 198, in __init__
     librados_path = find_library('rados')
   File "/usr/lib64/python2.6/ctypes/util.py", line 209, in find_library
     return _findSoname_ldconfig(name) or _get_soname(_findLib_gcc(name))
   File "/usr/lib64/python2.6/ctypes/util.py", line 203, in  
_findSoname_ldconfig
     os.popen('LANG=C /sbin/ldconfig -p 2>/dev/null').read())
OSError: [Errno 12] Cannot allocate memory
[root@ ~]# lsof | wc
-bash: fork: Cannot allocate memory
[root@ ~]# lsof | wc
   21801  211209 3230028
[root@ ~]# ceph status
^CError connecting to cluster: InterruptedOrTimeoutError
[root@ ~]# lsof | wc
    2028   17476  190947
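The transcript itself is telling: fork() failing with "Cannot allocate memory" while plenty of memory remains free usually points at kernel-side accounting rather than real exhaustion, either the task-count ceilings or memory overcommit limits. Both are cheap to check while the splitting is in progress (a sketch):

[root@ ~]# ps -eLf | wc -l                               # tasks (processes + threads) system-wide
[root@ ~]# sysctl kernel.pid_max kernel.threads-max
[root@ ~]# sysctl vm.overcommit_memory vm.overcommit_ratio
[root@ ~]# grep -i commit /proc/meminfo                  # Committed_AS vs CommitLimit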



Meanwhile, the daemons crashed.

I verified that memory never ran out.
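Given that, one ceiling worth ruling out is kernel.pid_max: every thread consumes a PID, the stock default of 32768 is easy to exhaust when each OSD runs thousands of messenger threads, and hitting it makes fork()/pthread_create() fail even though free memory remains. Raising it is low-risk to try (a sketch; 4194303 is near the 64-bit compile-time maximum, size it to your own daemon count):

[root@ ~]# sysctl -w kernel.pid_max=4194303
[root@ ~]# echo 'kernel.pid_max = 4194303' >> /etc/sysctl.conf    # persist across reboots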

Thanks!
Kenneth


> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com


----- End message from Gregory Farnum <greg at inktank.com> -----


-- 

Kind regards,
Kenneth Waegeman


