Re: ceph pg query hangs for ever

> On 1 April 2016 at 1:28, Goncalo Borges <goncalo.borges@xxxxxxxxxxxxx> wrote:
> 
> 
> Hi Mart, Wido...
> 
> A disclaimer: Not really an expert, just a regular site admin sharing my 
> experience.
> 

Thanks!

> At the beginning of the thread you gave the impression that only osd.68 has 
> problems dealing with the problematic PG 3.117. If that is indeed the 
> case, you could simply mark osd.68 down and remove it from the 
> cluster. This will trigger Ceph to replicate all PGs on osd.68 to other 
> osds, based on the other PG replicas.
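> 
> Roughly, the usual removal sequence would be (from memory, adjust the id
> to your setup and double-check against the docs):
> 
> $ ceph osd out 68
> ... stop the ceph-osd daemon for osd.68 on its host ...
> $ ceph osd crush remove osd.68
> $ ceph auth del osd.68
> $ ceph osd rm 68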
> 
> However, in your last email you seem to suggest that it is PG 3.117 
> itself which has problems, which makes all osds sharing that PG 
> problematic. Because of that you marked all osds sharing that PG as down.
> 
> 
> Before actually trying something more drastic, I would go for a more 
> classic approach. For example, what happens if you bring only one osd up? 
> I would start with osd.74, since you suspect problems in osd.68 and 
> osd.55 was the reason for the dump message below. If it still aborts, 
> then it means that the PG might have been replicated everywhere with 
> 'bad' data.
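> 
> Depending on the init system, bringing up a single osd is something like
> (upstart shown; sysvinit/systemd differ):
> 
> $ start ceph-osd id=74       # or: /etc/init.d/ceph start osd.74
> $ ceph -w                    # watch whether the PG recovers or the osd asserts again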
> 
> The drastic approach (if you do not care about the data in that PG) is to 
> mark those osds as down and force the PG to be recreated using 'ceph 
> pg force_create_pg 3.117'. Based on my previous experience, once I'd 
> recreated a PG, 'ceph pg dump_stuck stale' showed that PG stuck in the 
> creating state forever. To make it right, I had to restart the proper osds. 
> But, as you stated, you then have to deal with data corruption at the VM 
> level... Maybe that is a problem, maybe it isn't...
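> 
> In that case the rough sequence would be (again from memory, with your
> osd ids):
> 
> ... stop / mark down the osds carrying 3.117 ...
> $ ceph pg force_create_pg 3.117
> $ ceph pg dump_stuck stale       # in my case the PG sat in 'creating' here
> ... restart the osds that should host 3.117 ...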
> 

Eventually we were able to get things running again. Marking a few OSDs down
helped, but then others started to crash.

Somehow the PG got corrupted on one of the OSDs, and the OSDs kept crashing on
a single object.

After removing the object with ceph-objectstore-tool from the OSDs' data
directories, we were able to start the OSDs again and have the PG migrate away.
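
For reference, the invocation was roughly of this form (with the OSD stopped;
osd.55 and the default paths shown as an example, object name taken from the
crash log below):

$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-55 \
      --journal-path /var/lib/ceph/osd/ceph-55/journal \
      --pgid 3.117 --op list | grep 900a62ae8944a
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-55 \
      --journal-path /var/lib/ceph/osd/ceph-55/journal \
      '<object spec from the list output>' remove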

We had to restore one RBD image from a backup since it was missing 4MB in the
filesystem.

Wido

> Hope that helps
> Cheers
> Goncalo
> 
> 
> 
> 
> On 03/31/2016 12:26 PM, Mart van Santen wrote:
> >
> >
> > Hello,
> >
> > Well, unfortunately the problem is not really solved. Yes, we managed 
> > to get to a good health state at some point, but when a client hits some 
> > specific data, the osd process crashes with the errors below. The 3 OSDs 
> > which handle 3.117, the problematic PG, are currently down and 
> > reweighted to 0, so non-affected PGs are currently being rebuilt on 
> > other OSDs.
> > If I bring the crashed osds up, they crash again within a few minutes.
> >
> > As I'm a bit afraid for the data in this PG, I think we want to 
> > recreate the PG with empty data and discard the old disks. I 
> > understand I will get data corruption on several RBDs in this case, 
> > but we will try to solve that and rebuild the affected VMs. Does this 
> > make sense, and what are the best next steps?
> >
> > Regards,
> >
> > Mart
> >
> >
> >
> >
> >
> >    -34> 2016-03-31 03:07:56.932800 7f8e43829700  3 osd.55 122203 
> > handle_osd_map epochs [122203,122203], i have 122203, src has 
> > [120245,122203]
> >    -33> 2016-03-31 03:07:56.932837 7f8e43829700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 <== osd.45 
> > [2a00:c6c0:0:122::103]:6800/1852 7 ==== pg_info(1 pgs e122202:3.117) 
> > v4 ==== 919+0+0 (3389909573 0 0) 0x528bc00 con 0x1200a840
> >    -32> 2016-03-31 03:07:56.932855 7f8e43829700  5 -- op tracker -- 
> > seq: 22, time: 2016-03-31 03:07:56.932770, event: header_read, op: 
> > pg_info(1 pgs e122202:3.117)
> >    -31> 2016-03-31 03:07:56.932869 7f8e43829700  5 -- op tracker -- 
> > seq: 22, time: 2016-03-31 03:07:56.932771, event: throttled, op: 
> > pg_info(1 pgs e122202:3.117)
> >    -30> 2016-03-31 03:07:56.932878 7f8e43829700  5 -- op tracker -- 
> > seq: 22, time: 2016-03-31 03:07:56.932822, event: all_read, op: 
> > pg_info(1 pgs e122202:3.117)
> >    -29> 2016-03-31 03:07:56.932886 7f8e43829700  5 -- op tracker -- 
> > seq: 22, time: 2016-03-31 03:07:56.932851, event: dispatched, op: 
> > pg_info(1 pgs e122202:3.117)
> >    -28> 2016-03-31 03:07:56.932895 7f8e43829700  5 -- op tracker -- 
> > seq: 22, time: 2016-03-31 03:07:56.932895, event: waiting_for_osdmap, 
> > op: pg_info(1 pgs e122202:3.117)
> >    -27> 2016-03-31 03:07:56.932912 7f8e43829700  5 -- op tracker -- 
> > seq: 22, time: 2016-03-31 03:07:56.932912, event: started, op: 
> > pg_info(1 pgs e122202:3.117)
> >    -26> 2016-03-31 03:07:56.932947 7f8e43829700  5 -- op tracker -- 
> > seq: 22, time: 2016-03-31 03:07:56.932947, event: done, op: pg_info(1 
> > pgs e122202:3.117)
> >    -25> 2016-03-31 03:07:56.933022 7f8e3c01a700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> [2a00:c6c0:0:122::103]:6800/1852 
> > -- osd_map(122203..122203 src has 121489..122203) v3 -- ?+0 0x11c7fd40 
> > con 0x1200a840
> >    -24> 2016-03-31 03:07:56.933041 7f8e3c01a700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> [2a00:c6c0:0:122::103]:6800/1852 
> > -- pg_info(1 pgs e122203:3.117) v4 -- ?+0 0x528bde0 con 0x1200a840
> >    -23> 2016-03-31 03:07:56.933111 7f8e3c01a700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> [2a00:c6c0:0:122::105]:6810/3568 
> > -- osd_map(122203..122203 src has 121489..122203) v3 -- ?+0 0x12200d00 
> > con 0x1209d4a0
> >    -22> 2016-03-31 03:07:56.933125 7f8e3c01a700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> [2a00:c6c0:0:122::105]:6810/3568 
> > -- pg_info(1 pgs e122203:3.117) v4 -- ?+0 0x5288960 con 0x1209d4a0
> >    -21> 2016-03-31 03:07:56.933154 7f8e3c01a700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> 
> > [2a00:c6c0:0:122::108]:6816/1002847 -- pg_info(1 pgs e122203:3.117) v4 
> > -- ?+0 0x5288d20 con 0x101a19c0
> >    -20> 2016-03-31 03:07:56.933212 7f8e3c01a700  5 osd.55 pg_epoch: 
> > 122203 pg[3.117( v 122193'1898519 (108032'1895437,122193'1898519] 
> > local-les=122202 n=2789 ec=23736 les/c 122202/122047 
> > 122062/122201/122201) [72,54,45]/[55] r=0 lpr=122201 
> > pi=122046-122200/51 bft=45,54,72 crt=122133'1898514 lcod 0'0 mlcod 0'0 
> > active+undersized+degraded+remapped] on activate: bft=45,54,72 from 
> > 0//0//-1
> >    -19> 2016-03-31 03:07:56.933232 7f8e3c01a700  5 osd.55 pg_epoch: 
> > 122203 pg[3.117( v 122193'1898519 (108032'1895437,122193'1898519] 
> > local-les=122202 n=2789 ec=23736 les/c 122202/122047 
> > 122062/122201/122201) [72,54,45]/[55] r=0 lpr=122201 
> > pi=122046-122200/51 bft=45,54,72 crt=122133'1898514 lcod 0'0 mlcod 0'0 
> > active+undersized+degraded+remapped] target shard 45 from 0//0//-1
> >    -18> 2016-03-31 03:07:56.933244 7f8e3c01a700  5 osd.55 pg_epoch: 
> > 122203 pg[3.117( v 122193'1898519 (108032'1895437,122193'1898519] 
> > local-les=122202 n=2789 ec=23736 les/c 122202/122047 
> > 122062/122201/122201) [72,54,45]/[55] r=0 lpr=122201 
> > pi=122046-122200/51 bft=45,54,72 crt=122133'1898514 lcod 0'0 mlcod 0'0 
> > active+undersized+degraded+remapped] target shard 54 from 0//0//-1
> >    -17> 2016-03-31 03:07:56.933255 7f8e3c01a700  5 osd.55 pg_epoch: 
> > 122203 pg[3.117( v 122193'1898519 (108032'1895437,122193'1898519] 
> > local-les=122202 n=2789 ec=23736 les/c 122202/122047 
> > 122062/122201/122201) [72,54,45]/[55] r=0 lpr=122201 
> > pi=122046-122200/51 bft=45,54,72 crt=122133'1898514 lcod 0'0 mlcod 0'0 
> > active+undersized+degraded+remapped] target shard 72 from 0//0//-1
> >    -16> 2016-03-31 03:07:56.933283 7f8e3680f700  5 -- op tracker -- 
> > seq: 20, time: 2016-03-31 03:07:56.933283, event: reached_pg, op: 
> > osd_op(client.776466.1:190178605 
> > rbd_data.900a62ae8944a.0000000000000829 [set-alloc-hint object_size 
> > 4194304 write_size 4194304,write 8192~8192] 3.b1492517 RETRY=1 snapc 
> > 8b3=[8b3] ondisk+retry+write e122203)
> >    -15> 2016-03-31 03:07:56.933507 7f8e3680f700  5 -- op tracker -- 
> > seq: 20, time: 2016-03-31 03:07:56.933507, event: started, op: 
> > osd_op(client.776466.1:190178605 
> > rbd_data.900a62ae8944a.0000000000000829 [set-alloc-hint object_size 
> > 4194304 write_size 4194304,write 8192~8192] 3.b1492517 RETRY=1 snapc 
> > 8b3=[8b3] ondisk+retry+write e122203)
> >    -14> 2016-03-31 03:07:56.933648 7f8e3680f700  5 -- op tracker -- 
> > seq: 20, time: 2016-03-31 03:07:56.933648, event: waiting for subops 
> > from 45,54,72, op: osd_op(client.776466.1:190178605 
> > rbd_data.900a62ae8944a.0000000000000829 [set-alloc-hint object_size 
> > 4194304 write_size 4194304,write 8192~8192] 3.b1492517 RETRY=1 snapc 
> > 8b3=[8b3] ondisk+retry+write e122203)
> >    -13> 2016-03-31 03:07:56.933682 7f8e3680f700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> [2a00:c6c0:0:122::103]:6800/1852 
> > -- osd_repop(client.776466.1:190178605 3.117 
> > b1492517/rbd_data.900a62ae8944a.0000000000000829/head//3 v 
> > 122203'1898521) v1 -- ?+46 0x11e96400 con 0x1200a840
> >    -12> 2016-03-31 03:07:56.933712 7f8e3680f700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> [2a00:c6c0:0:122::105]:6810/3568 
> > -- osd_repop(client.776466.1:190178605 3.117 
> > b1492517/rbd_data.900a62ae8944a.0000000000000829/head//3 v 
> > 122203'1898521) v1 -- ?+46 0x11e96a00 con 0x1209d4a0
> >    -11> 2016-03-31 03:07:56.933735 7f8e3680f700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 --> 
> > [2a00:c6c0:0:122::108]:6816/1002847 -- 
> > osd_repop(client.776466.1:190178605 3.117 
> > b1492517/rbd_data.900a62ae8944a.0000000000000829/head//3 v 
> > 122203'1898521) v1 -- ?+46 0x11e97600 con 0x101a19c0
> >    -10> 2016-03-31 03:07:56.935173 7f8e30ef5700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 <== osd.72 
> > [2a00:c6c0:0:122::108]:6816/1002847 9 ==== 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0) v1 
> > ==== 83+0+0 (405786713 0 0) 0x11e66d00 con 0x101a19c0
> >     -9> 2016-03-31 03:07:56.935212 7f8e30ef5700  5 -- op tracker -- 
> > seq: 23, time: 2016-03-31 03:07:56.935087, event: header_read, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -8> 2016-03-31 03:07:56.935224 7f8e30ef5700  5 -- op tracker -- 
> > seq: 23, time: 2016-03-31 03:07:56.935090, event: throttled, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -7> 2016-03-31 03:07:56.935234 7f8e30ef5700  5 -- op tracker -- 
> > seq: 23, time: 2016-03-31 03:07:56.935162, event: all_read, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -6> 2016-03-31 03:07:56.935245 7f8e30ef5700  5 -- op tracker -- 
> > seq: 23, time: 0.000000, event: dispatched, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -5> 2016-03-31 03:07:56.936129 7f8e2dfc6700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 <== osd.45 
> > [2a00:c6c0:0:122::103]:6800/1852 8 ==== 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0) v1 
> > ==== 83+0+0 (3967999676 0 0) 0x11c7fd40 con 0x1200a840
> >     -4> 2016-03-31 03:07:56.936150 7f8e2dfc6700  5 -- op tracker -- 
> > seq: 24, time: 2016-03-31 03:07:56.936086, event: header_read, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -3> 2016-03-31 03:07:56.936159 7f8e2dfc6700  5 -- op tracker -- 
> > seq: 24, time: 2016-03-31 03:07:56.936087, event: throttled, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -2> 2016-03-31 03:07:56.936166 7f8e2dfc6700  5 -- op tracker -- 
> > seq: 24, time: 2016-03-31 03:07:56.936124, event: all_read, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -1> 2016-03-31 03:07:56.936172 7f8e2dfc6700  5 -- op tracker -- 
> > seq: 24, time: 0.000000, event: dispatched, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >      0> 2016-03-31 03:07:56.940165 7f8e3680f700 -1 osd/SnapMapper.cc: 
> > In function 'void SnapMapper::add_oid(const hobject_t&, const 
> > std::set<snapid_t>&, MapCacher::Transaction<std::basic_string<char>, 
> > ceph::buffer::list>*)' thread 7f8e3680f700 time 2016-03-31 03:07:56.933983
> > osd/SnapMapper.cc: 228: FAILED assert(r == -2)
> >
> >  ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
> >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> > const*)+0x8b) [0xba8b8b]
> >  2: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, 
> > std::less<snapid_t>, std::allocator<snapid_t> > const&, 
> > MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x61e) 
> > [0x72137e]
> >  3: (PG::update_snap_map(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > ObjectStore::Transaction&)+0x402) [0x7d25c2]
> >  4: (PG::append_log(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, eversion_t, eversion_t, 
> > ObjectStore::Transaction&, bool)+0x4e8) [0x7d2c68]
> >  5: (ReplicatedPG::log_operation(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > boost::optional<pg_hit_set_history_t>&, eversion_t const&, eversion_t 
> > const&, bool, ObjectStore::Transaction*)+0xba) [0x899eca]
> >  6: (ReplicatedBackend::submit_transaction(hobject_t const&, 
> > eversion_t const&, PGBackend::PGTransaction*, eversion_t const&, 
> > eversion_t const&, std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, 
> > unsigned long, osd_reqid_t, std::tr1::shared_ptr<OpRequest>)+0x77c) 
> > [0x9f06cc]
> >  7: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*)+0x7aa) [0x8391aa]
> >  8: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xbdd) [0x88792d]
> >  9: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x4559) 
> > [0x88cee9]
> >  10: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, 
> > ThreadPool::TPHandle&)+0x66a) [0x82702a]
> >  11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, 
> > std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3bd) [0x6961dd]
> >  12: (OSD::ShardedOpWQ::_process(unsigned int, 
> > ceph::heartbeat_handle_d*)+0x338) [0x696708]
> >  13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x875) 
> > [0xb98555]
> >  14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xb9a670]
> >  15: (()+0x8182) [0x7f8e57f6c182]
> >  16: (clone()+0x6d) [0x7f8e564d747d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
> > needed to interpret this.
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 1 lockdep
> >    0/ 1 context
> >    1/ 1 crush
> >    1/ 5 mds
> >    1/ 5 mds_balancer
> >    1/ 5 mds_locker
> >    1/ 5 mds_log
> >    1/ 5 mds_log_expire
> >    1/ 5 mds_migrator
> >    0/ 1 buffer
> >    0/ 1 timer
> >    0/ 1 filer
> >    0/ 1 striper
> >    0/ 1 objecter
> >    0/ 5 rados
> >    0/ 5 rbd
> >    0/ 5 rbd_replay
> >    0/ 5 journaler
> >    0/ 5 objectcacher
> >    0/ 5 client
> >    0/ 5 osd
> >    0/ 5 optracker
> >    0/ 5 objclass
> >    1/ 3 filestore
> >    1/ 3 keyvaluestore
> >    1/ 3 journal
> >    0/ 5 ms
> >    1/ 5 mon
> >    0/10 monc
> >    1/ 5 paxos
> >    0/ 5 tp
> >    1/ 5 auth
> >    1/ 5 crypto
> >    1/ 1 finisher
> >    1/ 5 heartbeatmap
> >    1/ 5 perfcounter
> >    1/ 5 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    1/ 5 asok
> >    1/ 1 throttle
> >    0/ 0 refs
> >    1/ 5 xio
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.55.log
> > --- end dump of recent events ---
> > 2016-03-31 03:07:56.960104 7f8e3680f700 -1 *** Caught signal (Aborted) **
> >  in thread 7f8e3680f700
> >
> >  ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
> >  1: /usr/bin/ceph-osd() [0xaaff6a]
> >  2: (()+0x10340) [0x7f8e57f74340]
> >  3: (gsignal()+0x39) [0x7f8e56413cc9]
> >  4: (abort()+0x148) [0x7f8e564170d8]
> >  5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f8e56d1e535]
> >  6: (()+0x5e6d6) [0x7f8e56d1c6d6]
> >  7: (()+0x5e703) [0x7f8e56d1c703]
> >  8: (()+0x5e922) [0x7f8e56d1c922]
> >  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> > const*)+0x278) [0xba8d78]
> >  10: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, 
> > std::less<snapid_t>, std::allocator<snapid_t> > const&, 
> > MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x61e) 
> > [0x72137e]
> >  11: (PG::update_snap_map(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > ObjectStore::Transaction&)+0x402) [0x7d25c2]
> >  12: (PG::append_log(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, eversion_t, eversion_t, 
> > ObjectStore::Transaction&, bool)+0x4e8) [0x7d2c68]
> >  13: (ReplicatedPG::log_operation(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > boost::optional<pg_hit_set_history_t>&, eversion_t const&, eversion_t 
> > const&, bool, ObjectStore::Transaction*)+0xba) [0x899eca]
> >  14: (ReplicatedBackend::submit_transaction(hobject_t const&, 
> > eversion_t const&, PGBackend::PGTransaction*, eversion_t const&, 
> > eversion_t const&, std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, 
> > unsigned long, osd_reqid_t, std::tr1::shared_ptr<OpRequest>)+0x77c) 
> > [0x9f06cc]
> >  15: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*)+0x7aa) 
> > [0x8391aa]
> >  16: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xbdd) 
> > [0x88792d]
> >  17: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x4559) 
> > [0x88cee9]
> >  18: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, 
> > ThreadPool::TPHandle&)+0x66a) [0x82702a]
> >  19: (OSD::dequeue_op(boost::intrusive_ptr<PG>, 
> > std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3bd) [0x6961dd]
> >  20: (OSD::ShardedOpWQ::_process(unsigned int, 
> > ceph::heartbeat_handle_d*)+0x338) [0x696708]
> >  21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x875) 
> > [0xb98555]
> >  22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xb9a670]
> >  23: (()+0x8182) [0x7f8e57f6c182]
> >  24: (clone()+0x6d) [0x7f8e564d747d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
> > needed to interpret this.
> >
> > --- begin dump of recent events ---
> >     -7> 2016-03-31 03:07:56.945486 7f8e43829700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 <== osd.54 
> > [2a00:c6c0:0:122::105]:6810/3568 7 ==== osd_map(122203..122203 src has 
> > 121489..122203) v3 ==== 222+0+0 (2966331141 0 0) 0x12200d00 con 0x1209d4a0
> >     -6> 2016-03-31 03:07:56.945514 7f8e43829700  3 osd.55 122203 
> > handle_osd_map epochs [122203,122203], i have 122203, src has 
> > [121489,122203]
> >     -5> 2016-03-31 03:07:56.945517 7f8e2d6bd700  1 -- 
> > [2a00:c6c0:0:122::105]:6822/11703 <== osd.54 
> > [2a00:c6c0:0:122::105]:6810/3568 8 ==== 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0) v1 
> > ==== 83+0+0 (4008969226 0 0) 0x11e661c0 con 0x1209d4a0
> >     -4> 2016-03-31 03:07:56.945538 7f8e2d6bd700  5 -- op tracker -- 
> > seq: 25, time: 2016-03-31 03:07:56.945488, event: header_read, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -3> 2016-03-31 03:07:56.945545 7f8e2d6bd700  5 -- op tracker -- 
> > seq: 25, time: 2016-03-31 03:07:56.945489, event: throttled, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -2> 2016-03-31 03:07:56.945549 7f8e2d6bd700  5 -- op tracker -- 
> > seq: 25, time: 2016-03-31 03:07:56.945512, event: all_read, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >     -1> 2016-03-31 03:07:56.945552 7f8e2d6bd700  5 -- op tracker -- 
> > seq: 25, time: 0.000000, event: dispatched, op: 
> > osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result = 0)
> >      0> 2016-03-31 03:07:56.960104 7f8e3680f700 -1 *** Caught signal 
> > (Aborted) **
> >  in thread 7f8e3680f700
> >
> >  ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
> >  1: /usr/bin/ceph-osd() [0xaaff6a]
> >  2: (()+0x10340) [0x7f8e57f74340]
> >  3: (gsignal()+0x39) [0x7f8e56413cc9]
> >  4: (abort()+0x148) [0x7f8e564170d8]
> >  5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f8e56d1e535]
> >  6: (()+0x5e6d6) [0x7f8e56d1c6d6]
> >  7: (()+0x5e703) [0x7f8e56d1c703]
> >  8: (()+0x5e922) [0x7f8e56d1c922]
> >  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> > const*)+0x278) [0xba8d78]
> >  10: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, 
> > std::less<snapid_t>, std::allocator<snapid_t> > const&, 
> > MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x61e) 
> > [0x72137e]
> >  11: (PG::update_snap_map(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > ObjectStore::Transaction&)+0x402) [0x7d25c2]
> >  12: (PG::append_log(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, eversion_t, eversion_t, 
> > ObjectStore::Transaction&, bool)+0x4e8) [0x7d2c68]
> >  13: (ReplicatedPG::log_operation(std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > boost::optional<pg_hit_set_history_t>&, eversion_t const&, eversion_t 
> > const&, bool, ObjectStore::Transaction*)+0xba) [0x899eca]
> >  14: (ReplicatedBackend::submit_transaction(hobject_t const&, 
> > eversion_t const&, PGBackend::PGTransaction*, eversion_t const&, 
> > eversion_t const&, std::vector<pg_log_entry_t, 
> > std::allocator<pg_log_entry_t> > const&, 
> > boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, 
> > unsigned long, osd_reqid_t, std::tr1::shared_ptr<OpRequest>)+0x77c) 
> > [0x9f06cc]
> >  15: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*)+0x7aa) 
> > [0x8391aa]
> >  16: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xbdd) 
> > [0x88792d]
> >  17: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x4559) 
> > [0x88cee9]
> >  18: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, 
> > ThreadPool::TPHandle&)+0x66a) [0x82702a]
> >  19: (OSD::dequeue_op(boost::intrusive_ptr<PG>, 
> > std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3bd) [0x6961dd]
> >  20: (OSD::ShardedOpWQ::_process(unsigned int, 
> > ceph::heartbeat_handle_d*)+0x338) [0x696708]
> >  21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x875) 
> > [0xb98555]
> >  22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xb9a670]
> >  23: (()+0x8182) [0x7f8e57f6c182]
> >  24: (clone()+0x6d) [0x7f8e564d747d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
> > needed to interpret this.
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 1 lockdep
> >    0/ 1 context
> >    1/ 1 crush
> >    1/ 5 mds
> >    1/ 5 mds_balancer
> >    1/ 5 mds_locker
> >    1/ 5 mds_log
> >    1/ 5 mds_log_expire
> >    1/ 5 mds_migrator
> >    0/ 1 buffer
> >    0/ 1 timer
> >    0/ 1 filer
> >    0/ 1 striper
> >    0/ 1 objecter
> >    0/ 5 rados
> >    0/ 5 rbd
> >    0/ 5 rbd_replay
> >    0/ 5 journaler
> >    0/ 5 objectcacher
> >    0/ 5 client
> >    0/ 5 osd
> >    0/ 5 optracker
> >    0/ 5 objclass
> >    1/ 3 filestore
> >    1/ 3 keyvaluestore
> >    1/ 3 journal
> >    0/ 5 ms
> >    1/ 5 mon
> >    0/10 monc
> >    1/ 5 paxos
> >    0/ 5 tp
> >    1/ 5 auth
> >    1/ 5 crypto
> >    1/ 1 finisher
> >    1/ 5 heartbeatmap
> >    1/ 5 perfcounter
> >    1/ 5 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    1/ 5 asok
> >    1/ 1 throttle
> >    0/ 0 refs
> >    1/ 5 xio
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.55.log
> > --- end dump of recent events ---
> >
> > On 03/30/2016 11:36 PM, Mart van Santen wrote:
> >> Hi there,
> >>
> >> With the help of a lot of people we were able to repair the PG and
> >> restore service. We will get back on this later with a full report for
> >> future reference.
> >>
> >> Regards,
> >>
> >> Mart
> >>
> >>
> >> On 03/30/2016 08:30 PM, Wido den Hollander wrote:
> >>> Hi,
> >>>
> >>> I have an issue with a Ceph cluster which I can't resolve.
> >>>
> >>> Due to OSD failure a PG is incomplete, but I can't query the PG to see
> >>> what I
> >>> can do to fix it.
> >>>
> >>>       health HEALTH_WARN
> >>>              1 pgs incomplete
> >>>              1 pgs stuck inactive
> >>>              1 pgs stuck unclean
> >>>              98 requests are blocked > 32 sec
> >>>
> >>> $ ceph pg 3.117 query
> >>>
> >>> That will hang for ever.
> >>>
> >>> $ ceph pg dump_stuck
> >>>
> >>> pg_stat	state	up	up_primary	acting	acting_primary
> >>> 3.117	incomplete	[68,55,74]	68	[68,55,74]	68
> >>>
> >>> The primary OSD in this case is osd.68. If I stop that OSD, the PG query
> >>> works, but the output says that bringing osd.68 back online will
> >>> probably help.
> >>>
> >>> The 98 requests which are blocked are also on osd.68, and they all say:
> >>> - initiated
> >>> - reached_pg
> >>>
> >>> The cluster is running Hammer 0.94.5 in this case.
> >>>
> >>> From what I know, an OSD had a failing disk and was restarted a couple
> >>> of times while the disk gave errors. This caused the PG to become
> >>> incomplete.
> >>>
> >>> I've set debug osd to 20, but I can't really tell what is going wrong on
> >>> osd.68
> >>> which causes it to stall this long.
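> >>>
> >>> For reference, that can be bumped at runtime with something like:
> >>>
> >>> $ ceph tell osd.68 injectargs '--debug-osd 20'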
> >>>
> >>> Any idea what to do here to get this PG up and running again?
> >>>
> >>> Wido
> >>
> >>
> >
> >
> >
> 
> -- 
> Goncalo Borges
> Research Computing
> ARC Centre of Excellence for Particle Physics at the Terascale
> School of Physics A28 | University of Sydney, NSW  2006
> T: +61 2 93511937
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


