Hello,
I have probably restarted the OSD too many times, or marked OSDs in/out too many times, but now I get this when starting osd.1:
root@proxmox-zotac:~# /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f
starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, ceph::bufferlist*)' thread 7f7fd358e880 time 2016-03-09 00:08:09.193975
osd/PG.cc: 2868: FAILED assert(r > 0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0xc03c46]
2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x4ab) [0x7c616b]
3: (OSD::load_pgs()+0xa20) [0x6a9170]
4: (OSD::init()+0xc84) [0x6ac204]
5: (main()+0x2839) [0x632459]
6: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
7: /usr/bin/ceph-osd() [0x64c087]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2016-03-09 00:08:09.196669 7f7fd358e880 -1 osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, ceph::bufferlist*)' thread 7f7fd358e880 time 2016-03-09 00:08:09.193975
osd/PG.cc: 2868: FAILED assert(r > 0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0xc03c46]
2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x4ab) [0x7c616b]
3: (OSD::load_pgs()+0xa20) [0x6a9170]
4: (OSD::init()+0xc84) [0x6ac204]
5: (main()+0x2839) [0x632459]
6: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
7: /usr/bin/ceph-osd() [0x64c087]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
0> 2016-03-09 00:08:09.196669 7f7fd358e880 -1 osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, ceph::bufferlist*)' thread 7f7fd358e880 time 2016-03-09 00:08:09.193975
osd/PG.cc: 2868: FAILED assert(r > 0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0xc03c46]
2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x4ab) [0x7c616b]
3: (OSD::load_pgs()+0xa20) [0x6a9170]
4: (OSD::init()+0xc84) [0x6ac204]
5: (main()+0x2839) [0x632459]
6: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
7: /usr/bin/ceph-osd() [0x64c087]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (Aborted) **
in thread 7f7fd358e880
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: /usr/bin/ceph-osd() [0xb04503]
2: (()+0xf8d0) [0x7f7fd24268d0]
3: (gsignal()+0x37) [0x7f7fd08c7067]
4: (abort()+0x148) [0x7f7fd08c8448]
5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f7fd11b4b3d]
6: (()+0x5ebb6) [0x7f7fd11b2bb6]
7: (()+0x5ec01) [0x7f7fd11b2c01]
8: (()+0x5ee19) [0x7f7fd11b2e19]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x247) [0xc03e17]
10: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x4ab) [0x7c616b]
11: (OSD::load_pgs()+0xa20) [0x6a9170]
12: (OSD::init()+0xc84) [0x6ac204]
13: (main()+0x2839) [0x632459]
14: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
15: /usr/bin/ceph-osd() [0x64c087]
2016-03-09 00:08:09.203630 7f7fd358e880 -1 *** Caught signal (Aborted) **
in thread 7f7fd358e880
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: /usr/bin/ceph-osd() [0xb04503]
2: (()+0xf8d0) [0x7f7fd24268d0]
3: (gsignal()+0x37) [0x7f7fd08c7067]
4: (abort()+0x148) [0x7f7fd08c8448]
5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f7fd11b4b3d]
6: (()+0x5ebb6) [0x7f7fd11b2bb6]
7: (()+0x5ec01) [0x7f7fd11b2c01]
8: (()+0x5ee19) [0x7f7fd11b2e19]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x247) [0xc03e17]
10: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x4ab) [0x7c616b]
11: (OSD::load_pgs()+0xa20) [0x6a9170]
12: (OSD::init()+0xc84) [0x6ac204]
13: (main()+0x2839) [0x632459]
14: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
15: /usr/bin/ceph-osd() [0x64c087]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
0> 2016-03-09 00:08:09.203630 7f7fd358e880 -1 *** Caught signal (Aborted) **
in thread 7f7fd358e880
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: /usr/bin/ceph-osd() [0xb04503]
2: (()+0xf8d0) [0x7f7fd24268d0]
3: (gsignal()+0x37) [0x7f7fd08c7067]
4: (abort()+0x148) [0x7f7fd08c8448]
5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f7fd11b4b3d]
6: (()+0x5ebb6) [0x7f7fd11b2bb6]
7: (()+0x5ec01) [0x7f7fd11b2c01]
8: (()+0x5ee19) [0x7f7fd11b2e19]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x247) [0xc03e17]
10: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x4ab) [0x7c616b]
11: (OSD::load_pgs()+0xa20) [0x6a9170]
12: (OSD::init()+0xc84) [0x6ac204]
13: (main()+0x2839) [0x632459]
14: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
15: /usr/bin/ceph-osd() [0x64c087]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
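For anyone reading along: the trace is long but keeps repeating the same few facts. Below is a rough Python sketch (my own helper, not a Ceph tool) that pulls the failed assertion and the named call chain out of such a log. The regexes assume the 0.94.x log layout shown above, and the sample is trimmed to the first trace.

```python
import re

# Sample copied from the crash output above, trimmed to the first trace.
LOG = """\
osd/PG.cc: 2868: FAILED assert(r > 0)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0xc03c46]
2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x4ab) [0x7c616b]
3: (OSD::load_pgs()+0xa20) [0x6a9170]
4: (OSD::init()+0xc84) [0x6ac204]
5: (main()+0x2839) [0x632459]
6: (__libc_start_main()+0xf5) [0x7f7fd08b3b45]
7: /usr/bin/ceph-osd() [0x64c087]
"""

# "<file>: <line>: FAILED assert(<expr>)"
ASSERT_RE = re.compile(r"^(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)$")
# "<n>: (<function>(<args>)+0x<offset>) [0x<addr>]"; frames without a symbol are skipped
FRAME_RE = re.compile(r"^\d+: \((?P<func>[\w:~]+)\(")


def summarize(log_text):
    """Return (assertion, call_chain) for a ceph-osd crash log.

    assertion is (file, line, expression) for the first failed assert, or None.
    call_chain lists each named function once, innermost frame first.
    """
    assertion = None
    chain = []
    for raw in log_text.splitlines():
        line = raw.strip()
        m = ASSERT_RE.match(line)
        if m:
            if assertion is None:
                assertion = (m["file"], int(m["line"]), m["expr"])
            continue
        m = FRAME_RE.match(line)
        if m and m["func"] not in chain:
            chain.append(m["func"])
    return assertion, chain


assertion, chain = summarize(LOG)
print(assertion)
print(" <- ".join(chain))
```

On the output above this reports the assertion at osd/PG.cc line 2868 (`r > 0`), reached from OSD::init via OSD::load_pgs and PG::peek_map_epoch, so the OSD aborts while loading its PGs at startup.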
2016-03-02 9:38 GMT+01:00 Mario Giammarco <mgiammarco@xxxxxxxxx>:
Here it is:
cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
health HEALTH_WARN
4 pgs incomplete
4 pgs stuck inactive
4 pgs stuck unclean
1 requests are blocked > 32 sec
monmap e8: 3 mons at {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
election epoch 840, quorum 0,1,2 0,1,2
osdmap e2405: 3 osds: 3 up, 3 in
pgmap v5904430: 288 pgs, 4 pools, 391 GB data, 100 kobjects
1090 GB used, 4481 GB / 5571 GB avail
284 active+clean
4 incomplete
client io 4008 B/s rd, 446 kB/s wr, 23 op/s
2016-03-02 9:31 GMT+01:00 Shinobu Kinjo <skinjo@xxxxxxxxxx>:
Is "ceph -s" still showing you the same output?
> cluster ac7bc476-3a02-453d-8e5c-606ab6f022ca
> health HEALTH_WARN
> 4 pgs incomplete
> 4 pgs stuck inactive
> 4 pgs stuck unclean
> monmap e8: 3 mons at {0=10.1.0.12:6789/0,1=10.1.0.14:6789/0,2=10.1.0.17:6789/0}
> election epoch 832, quorum 0,1,2 0,1,2
> osdmap e2400: 3 osds: 3 up, 3 in
> pgmap v5883297: 288 pgs, 4 pools, 391 GB data, 100 kobjects
> 1090 GB used, 4481 GB / 5571 GB avail
> 284 active+clean
> 4 incomplete
Cheers,
S
----- Original Message -----
From: "Mario Giammarco" <mgiammarco@xxxxxxxxx>
To: "Lionel Bouton" <lionel-subscription@xxxxxxxxxxx>
Cc: "Shinobu Kinjo" <skinjo@xxxxxxxxxx>, ceph-users@xxxxxxxxxxxxxx
Sent: Wednesday, March 2, 2016 4:27:15 PM
Subject: Re: Help: pool not responding
Tried to set min_size=1 but unfortunately nothing has changed.
Thanks for the idea.
2016-02-29 22:56 GMT+01:00 Lionel Bouton <lionel-subscription@xxxxxxxxxxx>:
> On 29/02/2016 22:50, Shinobu Kinjo wrote:
>
> the fact that they are optimized for benchmarks and certainly not
> Ceph OSD usage patterns (with or without internal journal).
>
> Are you assuming that SSHD is causing the issue?
> If you could elaborate on this more, it would be helpful.
>
>
> Probably not (unless they turn out to be extremely unreliable with Ceph
> OSD usage patterns, which would surprise me).
>
> For incomplete PGs, the documentation seems clear enough about what should
> be done:
> http://docs.ceph.com/docs/master/rados/operations/pg-states/
>
> The relevant text:
>
> *Incomplete* Ceph detects that a placement group is missing information
> about writes that may have occurred, or does not have any healthy copies.
> If you see this state, try to start any failed OSDs that may contain the
> needed information or temporarily adjust min_size to allow recovery.
>
> We don't have the full history, but the most probable cause of these
> incomplete PGs is that min_size is set to 2 or 3 and at some point the 4
> incomplete PGs didn't have as many replicas as the min_size value. So if
> setting min_size to 2 isn't enough, setting it to 1 should unfreeze them.
>
> Lionel
>
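To make the min_size suggestion above concrete, here is a minimal Python sketch that only builds and prints the ceph CLI calls one would typically run: check the pool's size and min_size, then lower min_size. The helper name and the pool name "mypool" are placeholders of mine, since the thread does not say which pool holds the incomplete PGs. Nothing is executed against the cluster.

```python
import shlex


def min_size_commands(pool, min_size):
    """Build the ceph CLI calls to inspect a pool and then change its min_size.

    Returns a list of argv lists; the caller decides whether to run them.
    """
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    return [
        ["ceph", "osd", "pool", "get", pool, "size"],
        ["ceph", "osd", "pool", "get", pool, "min_size"],
        ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)],
    ]


# "mypool" is a placeholder; substitute the pool that owns the incomplete PGs.
for argv in min_size_commands("mypool", 1):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Since the documentation quoted above speaks of adjusting min_size temporarily, the same helper can be called again with the previous value once the PGs have recovered.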
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com