ceph cluster inconsistency?

Hi,

I was running some tests with rados bench on an erasure-coded pool (using the
keyvaluestore-dev objectstore) on 0.83, and I am seeing some strange things:


[root@ceph001 ~]# ceph status
     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
      health HEALTH_WARN too few pgs per osd (4 < min 20)
      monmap e1: 3 mons at {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0}, election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003
      mdsmap e116: 1/1/1 up {0=ceph001.cubone.os=up:active}, 2 up:standby
      osdmap e292: 78 osds: 78 up, 78 in
       pgmap v48873: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
             1381 GB used, 129 TB / 131 TB avail
                  320 active+clean

The status reports around 15 TB of data, but only about 1.3 TB used.

This is also visible in rados:

[root@ceph001 ~]# rados df
pool name  category           KB  objects  clones  degraded  unfound  rd  rd KB       wr        wr KB
data       -                   0        0       0         0        0   0      0        0            0
ecdata     -         16113451009  3933959       0         0        0   1      1  3935632  16116850711
metadata   -                   2       20       0         0        0  33     36       21            8
rbd        -                   0        0       0         0        0   0      0        0            0
   total used      1448266016      3933979
   total avail   139400181016
   total space   140848447032
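
To make the mismatch concrete, here is a minimal Python sketch that only redoes the arithmetic on the figures quoted above. The variable names are mine, and the interpretation is an assumption: with erasure coding adding parity chunks, I would normally expect raw usage to be at least as large as the logical data, not roughly a tenth of it.

```python
# Figures copied from the `rados df` output above (all in KB).
ecdata_kb = 16113451009      # logical data reported for the ecdata pool
ecdata_objects = 3933959     # object count reported for the ecdata pool
total_used_kb = 1448266016   # "total used" reported for the whole cluster

# Average object size; rados bench writes 4 MB objects by default.
avg_object_kb = ecdata_kb / ecdata_objects

# How many times larger the logical data is than the reported raw usage.
logical_to_used = ecdata_kb / total_used_kb

print(f"average object size: {avg_object_kb:.0f} KB")        # 4096 KB
print(f"logical data / raw used: {logical_to_used:.1f}x")    # 11.1x
```

So the object count and sizes look consistent with what rados bench wrote, and it is the "used" figure that stands out.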


Another (possibly related) issue: every time I run rados -p ecdata ls, it
triggers OSD shutdowns. The command prints part of the object list and then
fails with an error:

...
benchmark_data_ceph001.cubone.os_8961_object243839
benchmark_data_ceph001.cubone.os_5560_object801983
benchmark_data_ceph001.cubone.os_31461_object856489
benchmark_data_ceph001.cubone.os_8961_object202232
benchmark_data_ceph001.cubone.os_4919_object33199
benchmark_data_ceph001.cubone.os_5560_object807797
benchmark_data_ceph001.cubone.os_4919_object74729
benchmark_data_ceph001.cubone.os_31461_object1264121
benchmark_data_ceph001.cubone.os_5560_object1318513
benchmark_data_ceph001.cubone.os_5560_object1202111
benchmark_data_ceph001.cubone.os_31461_object939107
benchmark_data_ceph001.cubone.os_31461_object729682
benchmark_data_ceph001.cubone.os_5560_object122915
benchmark_data_ceph001.cubone.os_5560_object76521
benchmark_data_ceph001.cubone.os_5560_object113261
benchmark_data_ceph001.cubone.os_31461_object575079
benchmark_data_ceph001.cubone.os_5560_object671042
benchmark_data_ceph001.cubone.os_5560_object381146
2014-08-13 17:57:48.736150 7f65047b5700  0 -- 10.141.8.180:0/1023295 >> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f64fc019db0).fault

The log of the crashed OSD shows:

    -25> 2014-08-13 17:52:56.323908 7f8a97fa4700  1 --  
10.143.8.182:6827/64670 <== osd.57 10.141.8.182:0/15796 51 ====  
osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0  
(3227325175 0 0) 0xf475940 con 0xee89fa0
    -24> 2014-08-13 17:52:56.323938 7f8a97fa4700  1 --  
10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 --  
osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0  
0xf815b00 con 0xee89fa0
    -23> 2014-08-13 17:52:56.324078 7f8a997a7700  1 --  
10.141.8.182:6840/64670 <== osd.57 10.141.8.182:0/15796 51 ====  
osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== 47+0+0  
(3227325175 0 0) 0xf132bc0 con 0xee8a680
    -22> 2014-08-13 17:52:56.324111 7f8a997a7700  1 --  
10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 --  
osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.323092) v2 -- ?+0  
0xf811a40 con 0xee8a680
    -21> 2014-08-13 17:52:56.584461 7f8a997a7700  1 --  
10.141.8.182:6840/64670 <== osd.29 10.143.8.181:0/12142 47 ====  
osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0  
(3355887204 0 0) 0xf655940 con 0xee88b00
    -20> 2014-08-13 17:52:56.584486 7f8a997a7700  1 --  
10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 --  
osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0  
0xf132bc0 con 0xee88b00
    -19> 2014-08-13 17:52:56.584498 7f8a97fa4700  1 --  
10.143.8.182:6827/64670 <== osd.29 10.143.8.181:0/12142 47 ====  
osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== 47+0+0  
(3355887204 0 0) 0xf20e040 con 0xee886e0
    -18> 2014-08-13 17:52:56.584526 7f8a97fa4700  1 --  
10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 --  
osd_ping(ping_reply e220 stamp 2014-08-13 17:52:56.583010) v2 -- ?+0  
0xf475940 con 0xee886e0
    -17> 2014-08-13 17:52:56.594448 7f8a798c7700  1 --  
10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74 :6839 s=0 pgs=0  
cs=0 l=0 c=0xee856a0).accept sd=74 10.141.8.180:47641/0
    -16> 2014-08-13 17:52:56.594921 7f8a798c7700  1 --  
10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 1 ====  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220) v4 ==== 151+0+39 (1972163119 0  
4174233976) 0xf3bca40 con 0xee856a0
    -15> 2014-08-13 17:52:56.594957 7f8a798c7700  5 -- op tracker -- ,  
seq: 299, time: 2014-08-13 17:52:56.594874, event: header_read, op:  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220)
    -14> 2014-08-13 17:52:56.594970 7f8a798c7700  5 -- op tracker -- ,  
seq: 299, time: 2014-08-13 17:52:56.594880, event: throttled, op:  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220)
    -13> 2014-08-13 17:52:56.594978 7f8a798c7700  5 -- op tracker -- ,  
seq: 299, time: 2014-08-13 17:52:56.594917, event: all_read, op:  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220)
    -12> 2014-08-13 17:52:56.594986 7f8a798c7700  5 -- op tracker -- ,  
seq: 299, time: 0.000000, event: dispatched, op:  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220)
    -11> 2014-08-13 17:52:56.595127 7f8a90795700  5 -- op tracker -- ,  
seq: 299, time: 2014-08-13 17:52:56.595104, event: reached_pg, op:  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220)
    -10> 2014-08-13 17:52:56.595159 7f8a90795700  5 -- op tracker -- ,  
seq: 299, time: 2014-08-13 17:52:56.595153, event: started, op:  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220)
     -9> 2014-08-13 17:52:56.602179 7f8a90795700  1 --  
10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433 -- osd_op_reply(1   
[pgls start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 -- ?+0  
0xec16180 con 0xee856a0
     -8> 2014-08-13 17:52:56.602211 7f8a90795700  5 -- op tracker -- ,  
seq: 299, time: 2014-08-13 17:52:56.602205, event: done, op:  
osd_op(client.7512.0:1  [pgls start_epoch 0] 3.0  
ack+read+known_if_redirected e220)
     -7> 2014-08-13 17:52:56.614839 7f8a798c7700  1 --  
10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 2 ====  
osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0  
ack+read+known_if_redirected e220) v4 ==== 151+0+89 (3460833343 0  
2600845095) 0xf3bcec0 con 0xee856a0
     -6> 2014-08-13 17:52:56.614864 7f8a798c7700  5 -- op tracker -- ,  
seq: 300, time: 2014-08-13 17:52:56.614789, event: header_read, op:  
osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0  
ack+read+known_if_redirected e220)
     -5> 2014-08-13 17:52:56.614874 7f8a798c7700  5 -- op tracker -- ,  
seq: 300, time: 2014-08-13 17:52:56.614792, event: throttled, op:  
osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0  
ack+read+known_if_redirected e220)
     -4> 2014-08-13 17:52:56.614884 7f8a798c7700  5 -- op tracker -- ,  
seq: 300, time: 2014-08-13 17:52:56.614835, event: all_read, op:  
osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0  
ack+read+known_if_redirected e220)
     -3> 2014-08-13 17:52:56.614891 7f8a798c7700  5 -- op tracker -- ,  
seq: 300, time: 0.000000, event: dispatched, op:  
osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0  
ack+read+known_if_redirected e220)
     -2> 2014-08-13 17:52:56.614972 7f8a92f9a700  5 -- op tracker -- ,  
seq: 300, time: 2014-08-13 17:52:56.614958, event: reached_pg, op:  
osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0  
ack+read+known_if_redirected e220)
     -1> 2014-08-13 17:52:56.614993 7f8a92f9a700  5 -- op tracker -- ,  
seq: 300, time: 2014-08-13 17:52:56.614986, event: started, op:  
osd_op(client.7512.0:2  [pgls start_epoch 220] 3.0  
ack+read+known_if_redirected e220)
      0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1  
os/GenericObjectMap.cc: In function 'int  
GenericObjectMap::list_objects(const coll_t&, ghobject_t, int,  
std::vector<ghobject_t>*, ghobject_t*)' thread 7f8a92f9a700 time  
2014-08-13 17:52:56.615073
os/GenericObjectMap.cc: 1118: FAILED assert(start <= header.oid)


  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
  1: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int,  
std::vector<ghobject_t, std::allocator<ghobject_t> >*,  
ghobject_t*)+0x474) [0x98f774]
  2: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int,  
int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,  
ghobject_t*)+0x274) [0x8c5b54]
  3: (PGBackend::objects_list_partial(hobject_t const&, int, int,  
snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*,  
hobject_t*)+0x1c9) [0x862de9]
  4: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) [0x7f67f5]
  5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
  6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,  
ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
  7: (OSD::dequeue_op(boost::intrusive_ptr<PG>,  
std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)  
[0x62bf8d]
  8: (OSD::ShardedOpWQ::_process(unsigned int,  
ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
  9: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8cd)  
[0xa776fd]
  10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
  11: (()+0x7df3) [0x7f8aac71fdf3]
  12: (clone()+0x6d) [0x7f8aab1963dd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is  
needed to interpret this.


  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
  1: /usr/bin/ceph-osd() [0x99b466]
  2: (()+0xf130) [0x7f8aac727130]
  3: (gsignal()+0x39) [0x7f8aab0d5989]
  4: (abort()+0x148) [0x7f8aab0d7098]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f8aab9e89d5]
  6: (()+0x5e946) [0x7f8aab9e6946]
  7: (()+0x5e973) [0x7f8aab9e6973]
  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char  
const*)+0x1ef) [0xa8805f]
  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int,  
std::vector<ghobject_t, std::allocator<ghobject_t> >*,  
ghobject_t*)+0x474) [0x98f774]
  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int,  
int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,  
ghobject_t*)+0x274) [0x8c5b54]
  12: (PGBackend::objects_list_partial(hobject_t const&, int, int,  
snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*,  
hobject_t*)+0x1c9) [0x862de9]
  13: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)  
[0x7f67f5]
  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,  
ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,  
std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)  
[0x62bf8d]
  17: (OSD::ShardedOpWQ::_process(unsigned int,  
ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned  
int)+0x8cd) [0xa776fd]
  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
  20: (()+0x7df3) [0x7f8aac71fdf3]
  21: (clone()+0x6d) [0x7f8aab1963dd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is  
needed to interpret this.

--- begin dump of recent events ---
      0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 *** Caught signal  
(Aborted) **
  in thread 7f8a92f9a700

  ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8)
  1: /usr/bin/ceph-osd() [0x99b466]
  2: (()+0xf130) [0x7f8aac727130]
  3: (gsignal()+0x39) [0x7f8aab0d5989]
  4: (abort()+0x148) [0x7f8aab0d7098]
  5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f8aab9e89d5]
  6: (()+0x5e946) [0x7f8aab9e6946]
  7: (()+0x5e973) [0x7f8aab9e6973]
  8: (()+0x5eb9f) [0x7f8aab9e6b9f]
  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char  
const*)+0x1ef) [0xa8805f]
  10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, int,  
std::vector<ghobject_t, std::allocator<ghobject_t> >*,  
ghobject_t*)+0x474) [0x98f774]
  11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, int,  
int, snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*,  
ghobject_t*)+0x274) [0x8c5b54]
  12: (PGBackend::objects_list_partial(hobject_t const&, int, int,  
snapid_t, std::vector<hobject_t, std::allocator<hobject_t> >*,  
hobject_t*)+0x1c9) [0x862de9]
  13: (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5)  
[0x7f67f5]
  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) [0x8177b3]
  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,  
ThreadPool::TPHandle&)+0x5d5) [0x7b8045]
  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,  
std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d)  
[0x62bf8d]
  17: (OSD::ShardedOpWQ::_process(unsigned int,  
ceph::heartbeat_handle_d*)+0x35c) [0x62c56c]
  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned  
int)+0x8cd) [0xa776fd]
  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa79980]
  20: (()+0x7df3) [0x7f8aac71fdf3]
  21: (clone()+0x6d) [0x7f8aab1963dd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is  
needed to interpret this.

I guess this is related to using the experimental keyvaluestore-dev backend?


Thanks!

Kenneth


