Hi Mart, Wido...
A disclaimer: I'm not really an expert, just a regular site admin
sharing my experience.
At the beginning of the thread you gave the impression that only osd.68
has problems dealing with the problematic PG 3.117. If that is
indeed the case, you could simply mark osd.68 down and remove
it from the cluster. This will trigger Ceph to replicate all PGs hosted
on osd.68 to other OSDs, using the remaining PG replicas.
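Something along these lines is the usual removal sequence (just a
sketch, untested here; double-check the ID and adapt the service
commands to your init system):

$ ceph osd out 68                  # mark it out so data starts remapping away from it
$ stop ceph-osd id=68              # or 'service ceph stop osd.68', depending on your distro
$ ceph osd crush remove osd.68     # remove it from the CRUSH map so backfill kicks in
$ ceph auth del osd.68             # delete its cephx key
$ ceph osd rm 68                   # remove the OSD id from the cluster map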
However, in your last email you seem to suggest that it is PG 3.117
itself which has problems, which makes all OSDs sharing that PG
problematic as well. Because of that, you marked all OSDs sharing
that PG as down.
Before actually trying something more drastic, I would go for a more
classic approach. For example, what happens if you bring only one OSD
up? I would start with osd.74, since you suspect problems in
osd.68 and osd.55 was the source of the dump message below. If it
still aborts, then it means that the PG might have been replicated
everywhere with 'bad' data.
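For instance, to test with only osd.74 while keeping the other two
down (a rough sketch; adapt the start command to your init system):

$ ceph osd set noout                        # avoid extra rebalancing while testing
$ start ceph-osd id=74                      # or 'service ceph start osd.74'
$ tail -f /var/log/ceph/ceph-osd.74.log     # watch whether it hits the same assert
$ ceph osd unset noout                      # once you are done testing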
The drastic approach (if you do not care about the data in that PG) is
to mark those OSDs as down and force the PG to be recreated using
'ceph pg force_create_pg 3.117'. Based on my previous experience,
once I had recreated a PG, 'ceph pg dump_stuck stale' showed the PG
stuck in the creating state forever. To make it right, I had to restart
the proper OSDs. But, as you stated, you then have to deal with data
corruption at the VM level... Maybe that is a problem, maybe it
isn't...
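From memory, the sequence was roughly the following (a sketch only,
and again: this throws away whatever is left in the PG):

$ ceph osd down 68
$ ceph osd down 55
$ ceph osd down 74
$ ceph pg force_create_pg 3.117        # ask the monitors to recreate the PG empty
$ ceph pg dump_stuck stale             # in my case the PG then sat in 'creating' state
$ restart ceph-osd id=68               # restarting the OSDs that should host the PG
                                       # cleared the 'creating' state; repeat for the others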
Hope that helps
Cheers
Goncalo
On 03/31/2016 12:26 PM, Mart van Santen wrote:
Hello,
Well, unfortunately the problem is not really solved. Yes, we
managed to get to a good health state at some point, but when a client
hits some specific data, the OSD process crashes with the errors
below. The 3 OSDs which handle 3.117, the PG with problems, are
currently down and we have reweighted them to 0, so the non-affected
PGs are currently being rebuilt on other OSDs.
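(Roughly, from memory, what we ran was along these lines; the exact
commands may have differed slightly:)

$ ceph osd reweight 68 0
$ ceph osd reweight 55 0
$ ceph osd reweight 74 0
$ stop ceph-osd id=68      # and likewise for the other two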
If I bring the crashed OSDs up, they crash again within a few
minutes.
As I'm a bit afraid for the data in this PG, I think we want to
recreate the PG with empty data and discard the old disks. I
understand I will get data corruption on several RBDs in this
case, but we will try to solve that and rebuild the affected VMs.
Does this make sense, and what are the best next steps?
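(To figure out which RBD images are affected, we were thinking of
matching the object prefixes seen in the crash logs against the
images, something like this untested loop, with <pool> as a
placeholder for the pool name:)

for img in $(rbd ls <pool>); do
    prefix=$(rbd info <pool>/$img | awk '/block_name_prefix/ {print $2}')
    echo "$img $prefix"
done
# then grep the crashing OSD logs for objects named rbd_data.<prefix>
# that map to PG 3.117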
Regards,
Mart
-34> 2016-03-31 03:07:56.932800 7f8e43829700 3 osd.55
122203 handle_osd_map epochs [122203,122203], i have 122203, src
has [120245,122203]
-33> 2016-03-31 03:07:56.932837 7f8e43829700 1 --
[2a00:c6c0:0:122::105]:6822/11703 <== osd.45
[2a00:c6c0:0:122::103]:6800/1852 7 ==== pg_info(1 pgs
e122202:3.117) v4 ==== 919+0+0 (3389909573 0 0) 0x528bc00 con
0x1200a840
-32> 2016-03-31 03:07:56.932855 7f8e43829700 5 -- op
tracker -- seq: 22, time: 2016-03-31 03:07:56.932770, event:
header_read, op: pg_info(1 pgs e122202:3.117)
-31> 2016-03-31 03:07:56.932869 7f8e43829700 5 -- op
tracker -- seq: 22, time: 2016-03-31 03:07:56.932771, event:
throttled, op: pg_info(1 pgs e122202:3.117)
-30> 2016-03-31 03:07:56.932878 7f8e43829700 5 -- op
tracker -- seq: 22, time: 2016-03-31 03:07:56.932822, event:
all_read, op: pg_info(1 pgs e122202:3.117)
-29> 2016-03-31 03:07:56.932886 7f8e43829700 5 -- op
tracker -- seq: 22, time: 2016-03-31 03:07:56.932851, event:
dispatched, op: pg_info(1 pgs e122202:3.117)
-28> 2016-03-31 03:07:56.932895 7f8e43829700 5 -- op
tracker -- seq: 22, time: 2016-03-31 03:07:56.932895, event:
waiting_for_osdmap, op: pg_info(1 pgs e122202:3.117)
-27> 2016-03-31 03:07:56.932912 7f8e43829700 5 -- op
tracker -- seq: 22, time: 2016-03-31 03:07:56.932912, event:
started, op: pg_info(1 pgs e122202:3.117)
-26> 2016-03-31 03:07:56.932947 7f8e43829700 5 -- op
tracker -- seq: 22, time: 2016-03-31 03:07:56.932947, event: done,
op: pg_info(1 pgs e122202:3.117)
-25> 2016-03-31 03:07:56.933022 7f8e3c01a700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::103]:6800/1852 -- osd_map(122203..122203 src has
121489..122203) v3 -- ?+0 0x11c7fd40 con 0x1200a840
-24> 2016-03-31 03:07:56.933041 7f8e3c01a700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::103]:6800/1852 -- pg_info(1 pgs e122203:3.117)
v4 -- ?+0 0x528bde0 con 0x1200a840
-23> 2016-03-31 03:07:56.933111 7f8e3c01a700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::105]:6810/3568 -- osd_map(122203..122203 src has
121489..122203) v3 -- ?+0 0x12200d00 con 0x1209d4a0
-22> 2016-03-31 03:07:56.933125 7f8e3c01a700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::105]:6810/3568 -- pg_info(1 pgs e122203:3.117)
v4 -- ?+0 0x5288960 con 0x1209d4a0
-21> 2016-03-31 03:07:56.933154 7f8e3c01a700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::108]:6816/1002847 -- pg_info(1 pgs
e122203:3.117) v4 -- ?+0 0x5288d20 con 0x101a19c0
-20> 2016-03-31 03:07:56.933212 7f8e3c01a700 5 osd.55
pg_epoch: 122203 pg[3.117( v 122193'1898519
(108032'1895437,122193'1898519] local-les=122202 n=2789 ec=23736
les/c 122202/122047 122062/122201/122201) [72,54,45]/[55] r=0
lpr=122201 pi=122046-122200/51 bft=45,54,72 crt=122133'1898514
lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped] on
activate: bft=45,54,72 from 0//0//-1
-19> 2016-03-31 03:07:56.933232 7f8e3c01a700 5 osd.55
pg_epoch: 122203 pg[3.117( v 122193'1898519
(108032'1895437,122193'1898519] local-les=122202 n=2789 ec=23736
les/c 122202/122047 122062/122201/122201) [72,54,45]/[55] r=0
lpr=122201 pi=122046-122200/51 bft=45,54,72 crt=122133'1898514
lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped] target
shard 45 from 0//0//-1
-18> 2016-03-31 03:07:56.933244 7f8e3c01a700 5 osd.55
pg_epoch: 122203 pg[3.117( v 122193'1898519
(108032'1895437,122193'1898519] local-les=122202 n=2789 ec=23736
les/c 122202/122047 122062/122201/122201) [72,54,45]/[55] r=0
lpr=122201 pi=122046-122200/51 bft=45,54,72 crt=122133'1898514
lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped] target
shard 54 from 0//0//-1
-17> 2016-03-31 03:07:56.933255 7f8e3c01a700 5 osd.55
pg_epoch: 122203 pg[3.117( v 122193'1898519
(108032'1895437,122193'1898519] local-les=122202 n=2789 ec=23736
les/c 122202/122047 122062/122201/122201) [72,54,45]/[55] r=0
lpr=122201 pi=122046-122200/51 bft=45,54,72 crt=122133'1898514
lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped] target
shard 72 from 0//0//-1
-16> 2016-03-31 03:07:56.933283 7f8e3680f700 5 -- op
tracker -- seq: 20, time: 2016-03-31 03:07:56.933283, event:
reached_pg, op: osd_op(client.776466.1:190178605
rbd_data.900a62ae8944a.0000000000000829 [set-alloc-hint
object_size 4194304 write_size 4194304,write 8192~8192] 3.b1492517
RETRY=1 snapc 8b3=[8b3] ondisk+retry+write e122203)
-15> 2016-03-31 03:07:56.933507 7f8e3680f700 5 -- op
tracker -- seq: 20, time: 2016-03-31 03:07:56.933507, event:
started, op: osd_op(client.776466.1:190178605
rbd_data.900a62ae8944a.0000000000000829 [set-alloc-hint
object_size 4194304 write_size 4194304,write 8192~8192] 3.b1492517
RETRY=1 snapc 8b3=[8b3] ondisk+retry+write e122203)
-14> 2016-03-31 03:07:56.933648 7f8e3680f700 5 -- op
tracker -- seq: 20, time: 2016-03-31 03:07:56.933648, event:
waiting for subops from 45,54,72, op:
osd_op(client.776466.1:190178605
rbd_data.900a62ae8944a.0000000000000829 [set-alloc-hint
object_size 4194304 write_size 4194304,write 8192~8192] 3.b1492517
RETRY=1 snapc 8b3=[8b3] ondisk+retry+write e122203)
-13> 2016-03-31 03:07:56.933682 7f8e3680f700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::103]:6800/1852 --
osd_repop(client.776466.1:190178605 3.117
b1492517/rbd_data.900a62ae8944a.0000000000000829/head//3 v
122203'1898521) v1 -- ?+46 0x11e96400 con 0x1200a840
-12> 2016-03-31 03:07:56.933712 7f8e3680f700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::105]:6810/3568 --
osd_repop(client.776466.1:190178605 3.117
b1492517/rbd_data.900a62ae8944a.0000000000000829/head//3 v
122203'1898521) v1 -- ?+46 0x11e96a00 con 0x1209d4a0
-11> 2016-03-31 03:07:56.933735 7f8e3680f700 1 --
[2a00:c6c0:0:122::105]:6822/11703 -->
[2a00:c6c0:0:122::108]:6816/1002847 --
osd_repop(client.776466.1:190178605 3.117
b1492517/rbd_data.900a62ae8944a.0000000000000829/head//3 v
122203'1898521) v1 -- ?+46 0x11e97600 con 0x101a19c0
-10> 2016-03-31 03:07:56.935173 7f8e30ef5700 1 --
[2a00:c6c0:0:122::105]:6822/11703 <== osd.72
[2a00:c6c0:0:122::108]:6816/1002847 9 ====
osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result =
0) v1 ==== 83+0+0 (405786713 0 0) 0x11e66d00 con 0x101a19c0
-9> 2016-03-31 03:07:56.935212 7f8e30ef5700 5 -- op
tracker -- seq: 23, time: 2016-03-31 03:07:56.935087, event:
header_read, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-8> 2016-03-31 03:07:56.935224 7f8e30ef5700 5 -- op
tracker -- seq: 23, time: 2016-03-31 03:07:56.935090, event:
throttled, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-7> 2016-03-31 03:07:56.935234 7f8e30ef5700 5 -- op
tracker -- seq: 23, time: 2016-03-31 03:07:56.935162, event:
all_read, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-6> 2016-03-31 03:07:56.935245 7f8e30ef5700 5 -- op
tracker -- seq: 23, time: 0.000000, event: dispatched, op:
osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result =
0)
-5> 2016-03-31 03:07:56.936129 7f8e2dfc6700 1 --
[2a00:c6c0:0:122::105]:6822/11703 <== osd.45
[2a00:c6c0:0:122::103]:6800/1852 8 ====
osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result =
0) v1 ==== 83+0+0 (3967999676 0 0) 0x11c7fd40 con 0x1200a840
-4> 2016-03-31 03:07:56.936150 7f8e2dfc6700 5 -- op
tracker -- seq: 24, time: 2016-03-31 03:07:56.936086, event:
header_read, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-3> 2016-03-31 03:07:56.936159 7f8e2dfc6700 5 -- op
tracker -- seq: 24, time: 2016-03-31 03:07:56.936087, event:
throttled, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-2> 2016-03-31 03:07:56.936166 7f8e2dfc6700 5 -- op
tracker -- seq: 24, time: 2016-03-31 03:07:56.936124, event:
all_read, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-1> 2016-03-31 03:07:56.936172 7f8e2dfc6700 5 -- op
tracker -- seq: 24, time: 0.000000, event: dispatched, op:
osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result =
0)
0> 2016-03-31 03:07:56.940165 7f8e3680f700 -1
osd/SnapMapper.cc: In function 'void SnapMapper::add_oid(const
hobject_t&, const std::set<snapid_t>&,
MapCacher::Transaction<std::basic_string<char>,
ceph::buffer::list>*)' thread 7f8e3680f700 time 2016-03-31
03:07:56.933983
osd/SnapMapper.cc: 228: FAILED assert(r == -2)
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x8b) [0xba8b8b]
2: (SnapMapper::add_oid(hobject_t const&,
std::set<snapid_t, std::less<snapid_t>,
std::allocator<snapid_t> > const&,
MapCacher::Transaction<std::string,
ceph::buffer::list>*)+0x61e) [0x72137e]
3: (PG::update_snap_map(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
ObjectStore::Transaction&)+0x402) [0x7d25c2]
4: (PG::append_log(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&, eversion_t,
eversion_t, ObjectStore::Transaction&, bool)+0x4e8) [0x7d2c68]
5: (ReplicatedPG::log_operation(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
boost::optional<pg_hit_set_history_t>&, eversion_t
const&, eversion_t const&, bool,
ObjectStore::Transaction*)+0xba) [0x899eca]
6: (ReplicatedBackend::submit_transaction(hobject_t const&,
eversion_t const&, PGBackend::PGTransaction*, eversion_t
const&, eversion_t const&, std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
boost::optional<pg_hit_set_history_t>&, Context*,
Context*, Context*, unsigned long, osd_reqid_t,
std::tr1::shared_ptr<OpRequest>)+0x77c) [0x9f06cc]
7: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*)+0x7aa)
[0x8391aa]
8: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xbdd)
[0x88792d]
9:
(ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x4559)
[0x88cee9]
10:
(ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x66a) [0x82702a]
11: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
std::tr1::shared_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3bd) [0x6961dd]
12: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0x338) [0x696708]
13: (ShardedThreadPool::shardedthreadpool_worker(unsigned
int)+0x875) [0xb98555]
14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
[0xb9a670]
15: (()+0x8182) [0x7f8e57f6c182]
16: (clone()+0x6d) [0x7f8e564d747d]
NOTE: a copy of the executable, or `objdump -rdS
<executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 keyvaluestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.55.log
--- end dump of recent events ---
2016-03-31 03:07:56.960104 7f8e3680f700 -1 *** Caught signal
(Aborted) **
in thread 7f8e3680f700
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: /usr/bin/ceph-osd() [0xaaff6a]
2: (()+0x10340) [0x7f8e57f74340]
3: (gsignal()+0x39) [0x7f8e56413cc9]
4: (abort()+0x148) [0x7f8e564170d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155)
[0x7f8e56d1e535]
6: (()+0x5e6d6) [0x7f8e56d1c6d6]
7: (()+0x5e703) [0x7f8e56d1c703]
8: (()+0x5e922) [0x7f8e56d1c922]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x278) [0xba8d78]
10: (SnapMapper::add_oid(hobject_t const&,
std::set<snapid_t, std::less<snapid_t>,
std::allocator<snapid_t> > const&,
MapCacher::Transaction<std::string,
ceph::buffer::list>*)+0x61e) [0x72137e]
11: (PG::update_snap_map(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
ObjectStore::Transaction&)+0x402) [0x7d25c2]
12: (PG::append_log(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&, eversion_t,
eversion_t, ObjectStore::Transaction&, bool)+0x4e8) [0x7d2c68]
13: (ReplicatedPG::log_operation(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
boost::optional<pg_hit_set_history_t>&, eversion_t
const&, eversion_t const&, bool,
ObjectStore::Transaction*)+0xba) [0x899eca]
14: (ReplicatedBackend::submit_transaction(hobject_t const&,
eversion_t const&, PGBackend::PGTransaction*, eversion_t
const&, eversion_t const&, std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
boost::optional<pg_hit_set_history_t>&, Context*,
Context*, Context*, unsigned long, osd_reqid_t,
std::tr1::shared_ptr<OpRequest>)+0x77c) [0x9f06cc]
15: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*)+0x7aa)
[0x8391aa]
16: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xbdd)
[0x88792d]
17:
(ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x4559)
[0x88cee9]
18:
(ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x66a) [0x82702a]
19: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
std::tr1::shared_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3bd) [0x6961dd]
20: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0x338) [0x696708]
21: (ShardedThreadPool::shardedthreadpool_worker(unsigned
int)+0x875) [0xb98555]
22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
[0xb9a670]
23: (()+0x8182) [0x7f8e57f6c182]
24: (clone()+0x6d) [0x7f8e564d747d]
NOTE: a copy of the executable, or `objdump -rdS
<executable>` is needed to interpret this.
--- begin dump of recent events ---
-7> 2016-03-31 03:07:56.945486 7f8e43829700 1 --
[2a00:c6c0:0:122::105]:6822/11703 <== osd.54
[2a00:c6c0:0:122::105]:6810/3568 7 ==== osd_map(122203..122203 src
has 121489..122203) v3 ==== 222+0+0 (2966331141 0 0) 0x12200d00
con 0x1209d4a0
-6> 2016-03-31 03:07:56.945514 7f8e43829700 3 osd.55
122203 handle_osd_map epochs [122203,122203], i have 122203, src
has [121489,122203]
-5> 2016-03-31 03:07:56.945517 7f8e2d6bd700 1 --
[2a00:c6c0:0:122::105]:6822/11703 <== osd.54
[2a00:c6c0:0:122::105]:6810/3568 8 ====
osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result =
0) v1 ==== 83+0+0 (4008969226 0 0) 0x11e661c0 con 0x1209d4a0
-4> 2016-03-31 03:07:56.945538 7f8e2d6bd700 5 -- op
tracker -- seq: 25, time: 2016-03-31 03:07:56.945488, event:
header_read, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-3> 2016-03-31 03:07:56.945545 7f8e2d6bd700 5 -- op
tracker -- seq: 25, time: 2016-03-31 03:07:56.945489, event:
throttled, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-2> 2016-03-31 03:07:56.945549 7f8e2d6bd700 5 -- op
tracker -- seq: 25, time: 2016-03-31 03:07:56.945512, event:
all_read, op: osd_repop_reply(client.776466.1:190178605 3.117
ondisk, result = 0)
-1> 2016-03-31 03:07:56.945552 7f8e2d6bd700 5 -- op
tracker -- seq: 25, time: 0.000000, event: dispatched, op:
osd_repop_reply(client.776466.1:190178605 3.117 ondisk, result =
0)
0> 2016-03-31 03:07:56.960104 7f8e3680f700 -1 *** Caught
signal (Aborted) **
in thread 7f8e3680f700
ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
1: /usr/bin/ceph-osd() [0xaaff6a]
2: (()+0x10340) [0x7f8e57f74340]
3: (gsignal()+0x39) [0x7f8e56413cc9]
4: (abort()+0x148) [0x7f8e564170d8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x155)
[0x7f8e56d1e535]
6: (()+0x5e6d6) [0x7f8e56d1c6d6]
7: (()+0x5e703) [0x7f8e56d1c703]
8: (()+0x5e922) [0x7f8e56d1c922]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x278) [0xba8d78]
10: (SnapMapper::add_oid(hobject_t const&,
std::set<snapid_t, std::less<snapid_t>,
std::allocator<snapid_t> > const&,
MapCacher::Transaction<std::string,
ceph::buffer::list>*)+0x61e) [0x72137e]
11: (PG::update_snap_map(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
ObjectStore::Transaction&)+0x402) [0x7d25c2]
12: (PG::append_log(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&, eversion_t,
eversion_t, ObjectStore::Transaction&, bool)+0x4e8) [0x7d2c68]
13: (ReplicatedPG::log_operation(std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
boost::optional<pg_hit_set_history_t>&, eversion_t
const&, eversion_t const&, bool,
ObjectStore::Transaction*)+0xba) [0x899eca]
14: (ReplicatedBackend::submit_transaction(hobject_t const&,
eversion_t const&, PGBackend::PGTransaction*, eversion_t
const&, eversion_t const&, std::vector<pg_log_entry_t,
std::allocator<pg_log_entry_t> > const&,
boost::optional<pg_hit_set_history_t>&, Context*,
Context*, Context*, unsigned long, osd_reqid_t,
std::tr1::shared_ptr<OpRequest>)+0x77c) [0x9f06cc]
15: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*)+0x7aa)
[0x8391aa]
16: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xbdd)
[0x88792d]
17:
(ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x4559)
[0x88cee9]
18:
(ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&,
ThreadPool::TPHandle&)+0x66a) [0x82702a]
19: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
std::tr1::shared_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3bd) [0x6961dd]
20: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0x338) [0x696708]
21: (ShardedThreadPool::shardedthreadpool_worker(unsigned
int)+0x875) [0xb98555]
22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10)
[0xb9a670]
23: (()+0x8182) [0x7f8e57f6c182]
24: (clone()+0x6d) [0x7f8e564d747d]
NOTE: a copy of the executable, or `objdump -rdS
<executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 keyvaluestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-osd.55.log
--- end dump of recent events ---
On 03/30/2016 11:36 PM, Mart van Santen wrote:
Hi there,
With the help of a lot of people we were able to repair the PG and
restore service. We will get back on this later with a full report for
future reference.
Regards,
Mart
On 03/30/2016 08:30 PM, Wido den Hollander wrote:
Hi,
I have an issue with a Ceph cluster which I can't resolve.
Due to an OSD failure a PG is incomplete, but I can't query the PG to see what I
can do to fix it.
health HEALTH_WARN
1 pgs incomplete
1 pgs stuck inactive
1 pgs stuck unclean
98 requests are blocked > 32 sec
$ ceph pg 3.117 query
That will hang for ever.
$ ceph pg dump_stuck
pg_stat state up up_primary acting acting_primary
3.117 incomplete [68,55,74] 68 [68,55,74] 68
The primary OSD for this PG is osd.68. If I stop that OSD the PG query works,
but it says that bringing osd.68 back online will probably help.
The 98 requests which are blocked are also on osd.68, and they all say:
- initiated
- reached_pg
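(For reference, the blocked ops and their events can be inspected via
the OSD's admin socket, e.g.:)

$ ceph daemon osd.68 dump_ops_in_flight
$ ceph daemon osd.68 dump_historic_ops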
The cluster is running Hammer 0.94.5 in this case.
From what I know, an OSD had a failing disk and was restarted a couple of times
while the disk gave errors. This caused the PG to become incomplete.
I've set debug osd to 20, but I can't really tell what is going wrong on osd.68
which causes it to stall this long.
Any idea what to do here to get this PG up and running again?
Wido
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW 2006
T: +61 2 93511937