Re: Another OSD broken today. How can I recover it?

Hello,

Things are getting worse every day.


ceph -w
    cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
     health HEALTH_ERR
            1 pgs are stuck inactive for more than 300 seconds
            8 pgs inconsistent
            1 pgs repair
            1 pgs stale
            1 pgs stuck stale
            recovery 20266198323167232/288980 objects degraded (7013010700798.405%)
            37154696925806624 scrub errors
            no legacy OSD present but 'sortbitwise' flag is not set
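
As an aside, I believe the 'sortbitwise' warning can be cleared on a pure-Jewel cluster by setting the flag (though that obviously won't fix anything else here):

ceph osd set sortbitwise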


But I'm finally finding time to work on recovery. The disk itself seems fine: no SMART errors, everything looks correct, it's just that Ceph won't start. Today I started looking into ceph-objectstore-tool, which I don't know much about.

It works fine, with none of the crashes I get when starting the OSD.

So I'm lost. Since both the OSD and ceph-objectstore-tool use the same backend, how is this possible? Could it be that list-pgs simply never reads the per-PG metadata that the OSD checks at startup?

Can someone help me fix this, please?



----------------------------------------------------------------------------------

ceph-objectstore-tool --debug --op list-pgs --data-path /var/lib/ceph/osd/ceph-4 --journal-path /dev/sdf3
2017-12-03 13:27:58.206069 7f02c203aa40  0 filestore(/var/lib/ceph/osd/ceph-4) backend xfs (magic 0x58465342)
2017-12-03 13:27:58.206528 7f02c203aa40  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-12-03 13:27:58.206546 7f02c203aa40  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-12-03 13:27:58.206569 7f02c203aa40  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: splice is supported
2017-12-03 13:27:58.251393 7f02c203aa40  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-12-03 13:27:58.251459 7f02c203aa40  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature: extsize is disabled by conf
2017-12-03 13:27:58.978809 7f02c203aa40  0 filestore(/var/lib/ceph/osd/ceph-4) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-12-03 13:27:58.990051 7f02c203aa40  1 journal _open /dev/sdf3 fd 11: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-12-03 13:27:59.002345 7f02c203aa40  1 journal _open /dev/sdf3 fd 11: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-12-03 13:27:59.004846 7f02c203aa40  1 filestore(/var/lib/ceph/osd/ceph-4) upgrade
Cluster fsid=9028f4da-0d77-462b-be9b-dbdf7fa57771
Supported features: compat={},rocompat={},incompat={1=initial feature set(~v.18),2=pginfo object,3=object locator,4=last_epoch_clean,5=categories,6=hobjectpool,7=biginfo,8=leveldbinfo,9=leveldblog,10=snapmapper,11=sharded objects,12=transaction hints,13=pg meta object}
On-disk features: compat={},rocompat={},incompat={1=initial feature set(~v.18),2=pginfo object,3=object locator,4=last_epoch_clean,5=categories,6=hobjectpool,7=biginfo,8=leveldbinfo,9=leveldblog,10=snapmapper,11=sharded objects,12=transaction hints,13=pg meta object}
Performing list-pgs operation
11.7f
10.4b
....
10.8d
2017-12-03 13:27:59.009327 7f02c203aa40  1 journal close /dev/sdf3
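
To dig further, I guess I can query an individual PG's metadata with the same tool, something like this (10.4b is just one of the PGs listed above):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /dev/sdf3 --pgid 10.4b --op info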




It looks like the problem has something to do with the PG metadata, because there's an assertion that fails on a size check.

Can this have something to do with what I'm getting from the pgmap? Those degraded counts look like a wrapped or corrupted 64-bit counter:

      pgmap v71223952: 764 pgs, 6 pools, 561 GB data, 141 kobjects
            1124 GB used, 1514 GB / 2639 GB avail
            20266198323167232/288980 objects degraded (7013010700798.405%)

This is the current crash when I start the OSD from the command line:

starting osd.4 at :/0 osd_data /var/lib/ceph/osd/ceph-4 /var/lib/ceph/osd/ceph-4/journal
osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, ceph::bufferlist*)' thread 7f467ba0b8c0 time 2017-12-03 13:39:29.495311
osd/PG.cc: 3025: FAILED assert(values.size() == 2)
 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x5556eab28790]
 2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, ceph::buffer::list*)+0x661) [0x5556ea4e6601]
 3: (OSD::load_pgs()+0x75a) [0x5556ea43a8aa]
 4: (OSD::init()+0x2026) [0x5556ea445ca6]
 5: (main()+0x2ef1) [0x5556ea3b7301]
 6: (__libc_start_main()+0xf0) [0x7f467886b830]
 7: (_start()+0x29) [0x5556ea3f8b09]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, ceph::bufferlist*)' thread 7f467ba0b8c0 time 2017-12-03 13:39:29.495311
osd/PG.cc: 3025: FAILED assert(values.size() == 2)


So it looks like the offending code is this, in PG::peek_map_epoch():

  // the two keys requested here are infover_key and epoch_key,
  // read from the PG's pgmeta object omap
  int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);
  if (r == 0) {
    assert(values.size() == 2);     <------ Here

    // sanity check version
How can values.size() be anything other than 2 here? If omap_get_values() succeeds but returns fewer entries, then one of the two keys must be missing from the pgmeta object's omap. Can this have something to do with the bogus pgmap values shown above?
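
If so, I should be able to inspect the pgmeta object's omap directly with ceph-objectstore-tool. A sketch of what I have in mind (10.36 is just an example PG; '<json-from-list>' stands for the JSON object spec copied from the list output, where the pgmeta object should be the entry with an empty oid, assuming it shows up in the listing):

# list the objects of one suspect PG
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /dev/sdf3 --pgid 10.36 --op list

# dump the omap keys of the pgmeta object found above
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /dev/sdf3 '<json-from-list>' list-omap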

Best regards



On 03/12/17 13:31, Gonzalo Aguilar Delgado wrote:

Hi,

Yes, nice. Until all your OSDs fail and you don't know what else to try. Looking at the failure rates, that will happen very soon.

I want to recover them. I'm describing what I've tried in another mail. Let's see if someone can help me.

I'm not doing anything special, just looking at my cluster from time to time only to find that something else has failed. I will try hard to recover from this situation.

Thank you.


On 26/11/17 16:13, Marc Roos wrote:
 
If I am not mistaken, the whole idea with the 3 replicas is that you 
have enough copies to recover from a failed OSD. In my tests this 
seems to go fine automatically. Are you doing something that is not 
advised?




-----Original Message-----
From: Gonzalo Aguilar Delgado [mailto:gaguilar@xxxxxxxxxxxxxxxxxx] 
Sent: Saturday, 25 November 2017 20:44
To: 'ceph-users'
Subject:  Another OSD broken today. How can I recover it?

Hello, 


I had another blackout with Ceph today. It seems that Ceph OSDs fail 
from time to time and are unable to recover. I have 3 OSDs down now: 
1 removed from the cluster and 2 down because I'm unable to recover 
them.


We really need a recovery tool. It's not normal that an OSD breaks 
and there's no way to recover it. Is there any way to do it?
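
The only approach I have seen suggested so far is to export the PGs 
from the broken OSD with ceph-objectstore-tool and import them into a 
healthy (stopped) OSD, roughly like this (pg id and file path are 
placeholders):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /dev/sdf3 --pgid 10.36 --op export --file /tmp/10.36.export

But I'd like to understand whether that is even safe here.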


The last one shows this:




] enter Reset
   -12> 2017-11-25 20:34:19.548891 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[0.34(unlocked)] enter Initial
   -11> 2017-11-25 20:34:19.548983 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[0.34( empty local-les=9685 n=0 ec=404 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
exit Initial 0.000091 0 0.000000
   -10> 2017-11-25 20:34:19.548994 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[0.34( empty local-les=9685 n=0 ec=404 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
enter Reset
    -9> 2017-11-25 20:34:19.549166 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[10.36(unlocked)] enter Initial
    -8> 2017-11-25 20:34:19.566781 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[10.36( v 9686'7301894 (9686'7298879,9686'7301894] local-les=9685 
n=534 ec=419 les/c/f 9685/9686/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'7301894 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] exit Initial 
0.017614 0 0.000000
    -7> 2017-11-25 20:34:19.566811 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[10.36( v 9686'7301894 (9686'7298879,9686'7301894] local-les=9685 
n=534 ec=419 les/c/f 9685/9686/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'7301894 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] enter Reset
    -6> 2017-11-25 20:34:19.585411 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[8.5c(unlocked)] enter Initial
    -5> 2017-11-25 20:34:19.602888 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[8.5c( empty local-les=9685 n=0 ec=348 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
exit Initial 0.017478 0 0.000000
    -4> 2017-11-25 20:34:19.602912 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[8.5c( empty local-les=9685 n=0 ec=348 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
enter Reset
    -3> 2017-11-25 20:34:19.603082 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[9.10(unlocked)] enter Initial
    -2> 2017-11-25 20:34:19.615456 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[9.10( v 9686'2322547 (9031'2319518,9686'2322547] local-les=9685 n=261 
ec=417 les/c/f 9685/9685/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'2322547 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] exit Initial 
0.012373 0 0.000000
    -1> 2017-11-25 20:34:19.615481 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[9.10( v 9686'2322547 (9031'2319518,9686'2322547] local-les=9685 n=261 
ec=417 les/c/f 9685/9685/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'2322547 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] enter Reset
     0> 2017-11-25 20:34:19.617400 7f6e5dc158c0 -1 osd/PG.cc: In 
function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, 
ceph::bufferlist*)' thread 7f6e5dc158c0 time 2017-11-25 20:34:19.615633
osd/PG.cc: 3025: FAILED assert(values.size() == 2)

 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x80) [0x5562d318d790]
 2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, 
ceph::buffer::list*)+0x661) [0x5562d2b4b601]
 3: (OSD::load_pgs()+0x75a) [0x5562d2a9f8aa]
 4: (OSD::init()+0x2026) [0x5562d2aaaca6]
 5: (main()+0x2ef1) [0x5562d2a1c301]
 6: (__libc_start_main()+0xf0) [0x7f6e5aa75830]
 7: (_start()+0x29) [0x5562d2a5db09]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 newstore
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   1/ 5 kinetic
   1/ 5 fuse
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.4.log
--- end dump of recent events ---
2017-11-25 20:34:19.622559 7f6e5dc158c0 -1 *** Caught signal (Aborted) 
**  in thread 7f6e5dc158c0 thread_name:ceph-osd

 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
 1: (()+0x98653e) [0x5562d308d53e]
 2: (()+0x11390) [0x7f6e5caee390]
 3: (gsignal()+0x38) [0x7f6e5aa8a428]
 4: (abort()+0x16a) [0x7f6e5aa8c02a]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x26b) [0x5562d318d97b]
 6: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, 
ceph::buffer::list*)+0x661) [0x5562d2b4b601]
 7: (OSD::load_pgs()+0x75a) [0x5562d2a9f8aa]
 8: (OSD::init()+0x2026) [0x5562d2aaaca6]
 9: (main()+0x2ef1) [0x5562d2a1c301]
 10: (__libc_start_main()+0xf0) [0x7f6e5aa75830]
 11: (_start()+0x29) [0x5562d2a5db09]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

--- begin dump of recent events ---
     0> 2017-11-25 20:34:19.622559 7f6e5dc158c0 -1 *** Caught signal 
(Aborted) **  in thread 7f6e5dc158c0 thread_name:ceph-osd

 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
 1: (()+0x98653e) [0x5562d308d53e]
 2: (()+0x11390) [0x7f6e5caee390]
 3: (gsignal()+0x38) [0x7f6e5aa8a428]
 4: (abort()+0x16a) [0x7f6e5aa8c02a]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x26b) [0x5562d318d97b]
 6: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, 
ceph::buffer::list*)+0x661) [0x5562d2b4b601]
 7: (OSD::load_pgs()+0x75a) [0x5562d2a9f8aa]
 8: (OSD::init()+0x2026) [0x5562d2aaaca6]
 9: (main()+0x2ef1) [0x5562d2a1c301]
 10: (__libc_start_main()+0xf0) [0x7f6e5aa75830]
 11: (_start()+0x29) [0x5562d2a5db09]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 newstore
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   1/ 5 kinetic
   1/ 5 fuse
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.4.log

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
