Re: OSD respawning -- FAILED assert(clone_size.count(clone))

David Zafman <dzafman@xxxxxxxxxx> · Tue, 8 Sep 2015 13:47:49 -0700

Chris,

I was wondering if you still had /tmp/snap.out laying around, could you 
send it to me?   The way the dump to json code works if the "clones" is 
empty it doesn't show me what two other structures look like.

David

On 9/5/15 3:24 PM, Chris Taylor wrote:
# ceph-dencoder type SnapSet import /tmp/snap.out decode dump_json
{
    "snap_context": {
        "seq": 9197,
        "snaps": [
            9197
        ]
    },
    "head_exists": 1,
    "clones": []
}

On 09/03/2015 04:48 PM, David Zafman wrote:

If you have ceph-dencoder installed or can build v0.94.3 to build the 
binary, you can dump the SnapSet for the problem object. Once you 
understand the removal procedure you could do the following to get a 
look at the SnapSet information.

Find the object from --op list with snapid -2 and cut and paste that 
json into the following command

Something like:
$ ceph-objectstore-tool --data-path xxxxxx --journal-path xxxxxx 
'["3.f9",{"oid":"rb.0.8c2990.238e1f29.000000008cc0","key":"","snapid":-2,"hash":###,"max":0,"pool":3,"namespace":"","max":0}] 
' get-attr snapset > /tmp/snap.out

$ ceph-dencoder type SnapSet import /tmp/snap.out decode dump_json

<Example output>
{
    "snap_context": {
        "seq": 4,
        "snaps": [
            4,
            3,
            2,
            1
        ]
    },
    "head_exists": 1,
    "clones": [
        {
            "snap": 1,
            "size": 1032,
            "overlap": "[]"
        },
        {
            "snap": 2,
            "size": 452,
            "overlap": "[]"
        },
        {
            "snap": 3,
            "size": 452,
            "overlap": "[]"
        },
        {
            "snap": 4,
            "size": 452,
            "overlap": "[]"
        }
    ]
}

On 9/3/15 2:44 PM, David Zafman wrote:

Chris,

WARNING: Do this at your own risk.  You are deleting one of the 
snapshots of a specific portion of an rbd image.  I'm not sure how 
rbd will react.  Maybe you should repair the SnapSet instead of 
remove the inconsistency.   However, as far as I know there isn't a 
tool to it.

If you are able to build from Ceph source, I happen to have an 
enhancement to ceph-objectstore-tool to output the SnapSet.

-------

The message preceding the assert is in the same thread so " 
rb.0.8c2990.238e1f29.000000008cc0/23ed//3" has the object name in 
it.  The 23ed is the RADOS clone/snap ID.

First, get a backup by export the pg using the 
ceph-objectstore-tool.  Specify a --file somewhere with enough of 
disk space.

$ ceph-objectstore-tool --data-path xxxxxx --journal-path xxxxxx 
--op export --pgid 3.f9 --file xxxxdestinationxxxx
Exporting 3.f9
....
Read 3/c55800f9/rb.0.8c2990.238e1f29.000000008cc0/23ed
....
Export successful

Now you need the JSON of the object in question.  The 3rd line of 
output has the snapid 9197 which is 23ed in decimal.

$ ceph-objectstore-tool --data-path xxxxxx --journal-path xxxxxx 
--op list rb.0.8c2990.238e1f29.000000008cc0

<Example output>
["3.f9",{"oid":"rb.0.8c2990.238e1f29.000000008cc0","key":"","snapid":9196,"hash":###,"max":0,"pool":3,"namespace":"","max":0}] 

["3.f9",{"oid":"rb.0.8c2990.238e1f29.000000008cc0","key":"","snapid",9197,"hash":###,"max":0,"pool":3,"namespace":"","max":0}] 

["3.f9",{"oid":"rb.0.8c2990.238e1f29.000000008cc0","key":"","snapid":9198,"hash":###,"max":0,"pool":3,"namespace":"","max":0}] 

["3.f9",{"oid":"rb.0.8c2990.238e1f29.000000008cc0","key":"","snapid":-2,"hash":###,"max":0,"pool":3,"namespace":"","max":0}] 

To remove it, cut and paste your output line with snapid 9197 inside 
single quotes like this:

$ ceph-objectstore-tool --data-path xxxxxx --journal-path xxxxxx 
'["3.f9",{"oid":"rb.0.8c2990.238e1f29.000000008cc0","key":"","snapid":9197,"hash":###,"max":0,"pool":3,"namespace":"","max":0}]' 
remove

<Example output>
remove 3/c55800f9/rb.0.8c2990.238e1f29.000000008cc0/23ed

To get all the OSDs to boot you'll have to do the remove on all OSDs 
that contain this PG and have an entry with snapid 9197 for this 
object.

David

On 9/3/15 11:29 AM, Chris Taylor wrote:
On 09/03/2015 10:20 AM, David Zafman wrote:

This crash is what happens if a clone is missing from SnapSet 
(internal data) for an object in the ObjectStore. If you had out 
of space issues, this could possibly have been caused by being 
able to rename or create files in a directory, but not being able 
to update SnapSet.

I've completely rewritten that logic so scrub doesn't crash, but 
it hasn't been in a release yet.  In the future scrub will just 
report an unexpected clone in the ObjectStore.

You'll need to find and remove the extraneous clone. Bump the 
"debug osd" to 20 so that you'll get the name of the object in the 
log.  Start an OSD and after it crashes examine the log. Then 
remove the extraneous object using ceph-objectstore-tool. You'll 
have to repeat this process if there are more of these.

David

I looked for an example of how to use the ceph-objectstore-tool 
aside from what was provided with "-h". I really don't know how to 
read the log output to get the object name. Can you please provide 
an example?

Here is the end of the log after bumping the OSD debug up to 20:

   -11> 2015-09-03 10:52:15.164310 7fb1f4afe700 10 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] 
be_select_auth_object: selecting osd 10 for obj 
1ee800f9/rb.0.4b777.2ae8944a.0000006cfd7e/head//3
   -10> 2015-09-03 10:52:15.164334 7fb1f4afe700 10 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] 
be_select_auth_object: selecting osd 30 for obj 
1ee800f9/rb.0.4b777.2ae8944a.0000006cfd7e/head//3
    -9> 2015-09-03 10:52:15.164359 7fb1f4afe700 10 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] 
be_select_auth_object: selecting osd 10 for obj 
9df800f9/rb.0.4b777.2ae8944a.00000074e565/head//3
    -8> 2015-09-03 10:52:15.164379 7fb1f4afe700 10 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] 
be_select_auth_object: selecting osd 30 for obj 
9df800f9/rb.0.4b777.2ae8944a.00000074e565/head//3
    -7> 2015-09-03 10:52:15.164405 7fb1f4afe700  2 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep]
    -6> 2015-09-03 10:52:15.164425 7fb1f4afe700 10 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] _scrub
    -5> 2015-09-03 10:52:15.164460 7fb1f4afe700 20 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] deep-scrub 
9df800f9/rb.0.4b777.2ae8944a.00000074e565/head//3 
9df800f9/rb.0.4b777.2ae8944a.00000074e565/head//3(8230'3883956 
osd.15.0:341626 wrlock_by=unknown.0.0:0 
dirty|data_digest|omap_digest s 4194304 uv 2001537 dd 9c907b8b od 
ffffffff)
    -4> 2015-09-03 10:52:15.164488 7fb1f4afe700 20 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] deep-scrub 
1ee800f9/rb.0.4b777.2ae8944a.0000006cfd7e/head//3 
1ee800f9/rb.0.4b777.2ae8944a.0000006cfd7e/head//3(9493'6768306 
osd.15.0:23521530 wrlock_by=unknown.0.0:0 
dirty|data_digest|omap_digest s 4194304 uv 4802739 dd 3d8ecc1d od 
ffffffff)
    -3> 2015-09-03 10:52:15.164513 7fb1f4afe700 20 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] deep-scrub 
af6800f9/rb.0.8c2990.238e1f29.0000001d1541/head//3 
af6800f9/rb.0.8c2990.238e1f29.0000001d1541/head//3(11118'10703888 
client.9498558.0:311803 wrlock_by=unknown.0.0:0 dirty|omap_digest s 
4194304 uv 10703888 od ffffffff)
    -2> 2015-09-03 10:52:15.164535 7fb1f4afe700 20 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] deep-scrub 
c55800f9/rb.0.8c2990.238e1f29.000000008cc0/head//3 
c55800f9/rb.0.8c2990.238e1f29.000000008cc0/head//3(11295'10706250 
client.9513567.0:37161 wrlock_by=unknown.0.0:0 dirty|omap_digest s 
4194304 uv 10706250 od ffffffff)
    -1> 2015-09-03 10:52:15.164562 7fb1f4afe700 20 osd.30 pg_epoch: 
227938 pg[3.f9( v 227870'10797633 (211962'10794623,227870'10797633] 
local-les=227938 n=11612 ec=101 les/c 227938/227938 
227937/227937/227937) [30,10] r=0 lpr=227937 crt=227623'10797625 
lcod 0'0 mlcod 0'0 active+clean+scrubbing+deep] deep-scrub 
c55800f9/rb.0.8c2990.238e1f29.000000008cc0/23ed//3 
c55800f9/rb.0.8c2990.238e1f29.000000008cc0/23ed//3(11044'10472430 
client.9186139.0:268115 [23ed] dirty|omap_digest s 4194304 uv 
9525846 od ffffffff)
     0> 2015-09-03 10:52:15.170668 7fb1f4afe700 -1 
osd/osd_types.cc: In function 'uint64_t 
SnapSet::get_clone_bytes(snapid_t) const' thread 7fb1f4afe700 time 
2015-09-03 10:52:15.164585
osd/osd_types.cc: 4076: FAILED assert(clone_size.count(clone))

Thanks,
Chris

On 9/3/15 2:22 AM, Gregory Farnum wrote:
On Thu, Sep 3, 2015 at 7:48 AM, Chris Taylor <ctaylor@xxxxxxxxxx> 
wrote:
I removed the latest OSD that was respawing (osd.23) and now I 
having the
same problem with osd.30. It looks like they both have pg 3.f9 
in common. I
tried "ceph pg repair 3.f9" but the OSD is still respawning.

Does anyone have any ideas?
This is definitely something wrong with the disk state on the 
OSD. Are
you using cache pools or any other features that aren't so standard?
What workload is running against this cluster? How are the OSDs
configured (btrfs/xfs/ext4, on single hard drives, etc)?
David might have some ideas what's gone wrong or what else 
besides the
log tail would be needed to diagnose.
-Greg

Thanks,
Chris

ceph-osd-03:ceph-osd.30.log

    -29> 2015-09-02 23:31:44.041181 7fbd1bf01700  0 
log_channel(cluster) log
[INF] : 3.f9 deep-scrub starts
    -28> 2015-09-02 23:31:44.042339 7fbd1bf01700  1 -- 
10.21.0.23:6824/3512
--> 10.21.0.22:6800/3623 -- replica scrub(pg:
3.f9,from:0'0,to:210364'10793453,epoch:222344,start:0//0//-1,end:1ff300f9//0//-1,chunky:1,deep:1,seed:4294967295,version:6) 

v6 -- ?+0 0x19906f80 con 0x186714a0
    -27> 2015-09-02 23:31:44.055769 7fbd27718700  1 -- 
10.20.0.23:6825/3512
<== osd.43 10.20.0.21:0/3850 1 ==== osd_ping(ping e222344 stamp 
2015-09-02
23:31:44.055321) v2 ==== 47+0+0 (2626217624 0 0) 0x4b43800 con 
0x19f466e0
    -26> 2015-09-02 23:31:44.055805 7fbd27718700  1 -- 
10.20.0.23:6825/3512
--> 10.20.0.21:0/3850 -- osd_ping(ping_reply e222344 stamp 
2015-09-02
23:31:44.055321) v2 -- ?+0 0x19adba00 con 0x19f466e0
    -25> 2015-09-02 23:31:44.056016 7fbd25f15700  1 -- 
10.21.0.23:6825/3512
<== osd.43 10.20.0.21:0/3850 1 ==== osd_ping(ping e222344 stamp 
2015-09-02
23:31:44.055321) v2 ==== 47+0+0 (2626217624 0 0) 0x4a72200 con 
0x19f46160
    -24> 2015-09-02 23:31:44.056043 7fbd25f15700  1 -- 
10.21.0.23:6825/3512
--> 10.20.0.21:0/3850 -- osd_ping(ping_reply e222344 stamp 
2015-09-02
23:31:44.055321) v2 -- ?+0 0x1b9b4000 con 0x19f46160
    -23> 2015-09-02 23:31:44.111037 7fbd177f2700  1 -- 
10.21.0.23:6824/3512
<== osd.10 10.21.0.22:6800/3623 28 ==== osd_sub_op(unknown.0.0:0 
3.f9
0//0//-1 [scrub-map] v 0'0 snapset=0=[]:[] snapc=0=[]) v11 ==== 
1226+0+13977
(2839644439 0 129553580) 0x19a3f700 con 0x186714a0
    -22> 2015-09-02 23:31:44.111071 7fbd177f2700  5 -- op 
tracker -- seq:
354, time: 2015-09-02 23:31:44.110934, event: header_read, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
    -21> 2015-09-02 23:31:44.111079 7fbd177f2700  5 -- op 
tracker -- seq:
354, time: 2015-09-02 23:31:44.110936, event: throttled, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
    -20> 2015-09-02 23:31:44.111085 7fbd177f2700  5 -- op 
tracker -- seq:
354, time: 2015-09-02 23:31:44.111028, event: all_read, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
    -19> 2015-09-02 23:31:44.111090 7fbd177f2700  5 -- op 
tracker -- seq:
354, time: 0.000000, event: dispatched, op: 
osd_sub_op(unknown.0.0:0 3.f9
0//0//-1 [scrub-map] v 0'0 snapset=0=[]:[] snapc=0=[])
    -18> 2015-09-02 23:31:44.111142 7fbd1f708700  5 -- op 
tracker -- seq:
354, time: 2015-09-02 23:31:44.111142, event: reached_pg, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
    -17> 2015-09-02 23:31:44.111167 7fbd1f708700  5 -- op 
tracker -- seq:
354, time: 2015-09-02 23:31:44.111167, event: started, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
    -16> 2015-09-02 23:31:44.111262 7fbd1f708700  5 -- op 
tracker -- seq:
354, time: 2015-09-02 23:31:44.111262, event: done, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
    -15> 2015-09-02 23:31:44.111374 7fbd1bf01700  2 osd.30 
pg_epoch: 222344
pg[3.f9( v 222344'10796467 (210364'10793453,222344'10796467]
local-les=222344 n=11597 ec=101 les/c 222344/222344 
222343/222343/222343)
[30,10] r=0 lpr=222343 crt=222335'10796453 lcod 222344'10796466 
mlcod
222344'10796466 active+clean+scrubbing+deep] 
scrub_compare_maps   osd.30 has
24 items
    -14> 2015-09-02 23:31:44.111508 7fbd1bf01700  2 osd.30 
pg_epoch: 222344
pg[3.f9( v 222344'10796467 (210364'10793453,222344'10796467]
local-les=222344 n=11597 ec=101 les/c 222344/222344 
222343/222343/222343)
[30,10] r=0 lpr=222343 crt=222335'10796453 lcod 222344'10796466 
mlcod
222344'10796466 active+clean+scrubbing+deep] scrub_compare_maps 
replica 10
has 24 items
    -13> 2015-09-02 23:31:44.111737 7fbd1bf01700  2 osd.30 
pg_epoch: 222344
pg[3.f9( v 222344'10796467 (210364'10793453,222344'10796467]
local-les=222344 n=11597 ec=101 les/c 222344/222344 
222343/222343/222343)
[30,10] r=0 lpr=222343 crt=222335'10796453 lcod 222344'10796466 
mlcod
222344'10796466 active+clean+scrubbing+deep]
    -12> 2015-09-02 23:31:44.113160 7fbd1bf01700  1 -- 
10.21.0.23:6824/3512
--> 10.21.0.22:6800/3623 -- replica scrub(pg:
3.f9,from:0'0,to:210364'10793453,epoch:222344,start:1ff300f9//0//-1,end:9ef800f9//0//-1,chunky:1,deep:1,seed:4294967295,version:6) 

v6 -- ?+0 0x1cf24800 con 0x186714a0
    -11> 2015-09-02 23:31:44.172116 7fbd177f2700  1 -- 
10.21.0.23:6824/3512
<== osd.10 10.21.0.22:6800/3623 29 ==== osd_sub_op(unknown.0.0:0 
3.f9
0//0//-1 [scrub-map] v 0'0 snapset=0=[]:[] snapc=0=[]) v11 ==== 
1226+0+16200
(2839644439 0 4288175685) 0x19a40d00 con 0x186714a0
    -10> 2015-09-02 23:31:44.172150 7fbd177f2700  5 -- op 
tracker -- seq:
355, time: 2015-09-02 23:31:44.172018, event: header_read, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -9> 2015-09-02 23:31:44.172158 7fbd177f2700  5 -- op 
tracker -- seq:
355, time: 2015-09-02 23:31:44.172020, event: throttled, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -8> 2015-09-02 23:31:44.172163 7fbd177f2700  5 -- op 
tracker -- seq:
355, time: 2015-09-02 23:31:44.172105, event: all_read, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -7> 2015-09-02 23:31:44.172168 7fbd177f2700  5 -- op 
tracker -- seq:
355, time: 0.000000, event: dispatched, op: 
osd_sub_op(unknown.0.0:0 3.f9
0//0//-1 [scrub-map] v 0'0 snapset=0=[]:[] snapc=0=[])
     -6> 2015-09-02 23:31:44.172217 7fbd1cf03700  5 -- op 
tracker -- seq:
355, time: 2015-09-02 23:31:44.172217, event: reached_pg, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -5> 2015-09-02 23:31:44.172233 7fbd1cf03700  5 -- op 
tracker -- seq:
355, time: 2015-09-02 23:31:44.172233, event: started, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -4> 2015-09-02 23:31:44.172319 7fbd1cf03700  5 -- op 
tracker -- seq:
355, time: 2015-09-02 23:31:44.172319, event: done, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -3> 2015-09-02 23:31:44.172365 7fbd1bf01700  2 osd.30 
pg_epoch: 222344
pg[3.f9( v 222344'10796467 (210364'10793453,222344'10796467]
local-les=222344 n=11597 ec=101 les/c 222344/222344 
222343/222343/222343)
[30,10] r=0 lpr=222343 crt=222335'10796453 lcod 222344'10796466 
mlcod
222344'10796466 active+clean+scrubbing+deep] 
scrub_compare_maps   osd.30 has
25 items
     -2> 2015-09-02 23:31:44.172394 7fbd1bf01700  2 osd.30 
pg_epoch: 222344
pg[3.f9( v 222344'10796467 (210364'10793453,222344'10796467]
local-les=222344 n=11597 ec=101 les/c 222344/222344 
222343/222343/222343)
[30,10] r=0 lpr=222343 crt=222335'10796453 lcod 222344'10796466 
mlcod
222344'10796466 active+clean+scrubbing+deep] scrub_compare_maps 
replica 10
has 25 items
     -1> 2015-09-02 23:31:44.172673 7fbd1bf01700  2 osd.30 
pg_epoch: 222344
pg[3.f9( v 222344'10796467 (210364'10793453,222344'10796467]
local-les=222344 n=11597 ec=101 les/c 222344/222344 
222343/222343/222343)
[30,10] r=0 lpr=222343 crt=222335'10796453 lcod 222344'10796466 
mlcod
222344'10796466 active+clean+scrubbing+deep]
      0> 2015-09-02 23:31:44.176301 7fbd1bf01700 -1 
osd/osd_types.cc: In
function 'uint64_t SnapSet::get_clone_bytes(snapid_t) const' thread
7fbd1bf01700 time 2015-09-02 23:31:44.172716
osd/osd_types.cc: 4076: FAILED assert(clone_size.count(clone))

  ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x8b) [0xbc376b]
  2: (SnapSet::get_clone_bytes(snapid_t) const+0xb6) [0x79dc06]
  3: (ReplicatedPG::_scrub(ScrubMap&, std::map<hobject_t, 
std::pair<unsigned
int, unsigned int>, std::less<hobject_t>, 
std::allocator<std::pair<hobject_t
const, std::pair<unsigned int, unsigned int> > > > 
const&)+0xa2a) [0x8852da]
  4: (PG::scrub_compare_maps()+0xd89) [0x7f45a9]
  5: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x1ee) [0x7f770e]
  6: (PG::scrub(ThreadPool::TPHandle&)+0x2ee) [0x7f907e]
  7: (OSD::ScrubWQ::_process(PG*, ThreadPool::TPHandle&)+0x19) 
[0x6cc9d9]
  8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xbb425e]
  9: (ThreadPool::WorkThread::entry()+0x10) [0xbb5300]
  10: (()+0x8182) [0x7fbd4113c182]
  11: (clone()+0x6d) [0x7fbd3f6a747d]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` 
is needed to
interpret this.

On 08/28/2015 11:55 AM, Chris Taylor wrote:
Fellow Ceph Users,

I have 3 OSD nodes and 3 MONs on separate servers. Our storage 
was near
full on some OSD so we added additional drives, almost doubling 
our space.
Since then we are getting OSDs that are respawning. We added 
additional RAM
to the OSD nodes, from 12G to 24G. It started with one OSD 
respawning so we
reweighted it to 0, then removed it from the CRUSHMAP. Then 
another OSD
started doing the same thing. We repeated the process again. We 
are now one
the 5th drive. Some of these drives are new and we're thinking 
there may be
something else going on than just bad/corrupt drives. Can 
someone please
help?

Here are some snippets from the logs files. If there is any 
thing else you
want to see let me know.

Thanks,
Chris

ceph-osd-03:syslog

Aug 27 11:33:37 ceph-osd-03 kernel: [380304.744712] init: ceph-osd
(ceph/23) main process (458) killed by ABRT signal
Aug 27 11:33:37 ceph-osd-03 kernel: [380304.744736] init: ceph-osd
(ceph/23) main process ended, respawning
Aug 27 11:33:49 ceph-osd-03 kernel: [380315.871768] init: ceph-osd
(ceph/23) main process (938) killed by ABRT signal
Aug 27 11:33:49 ceph-osd-03 kernel: [380315.871791] init: ceph-osd
(ceph/23) main process ended, respawning
Aug 27 11:34:00 ceph-osd-03 kernel: [380327.527056] init: ceph-osd
(ceph/23) main process (1463) killed by ABRT signal
Aug 27 11:34:00 ceph-osd-03 kernel: [380327.527079] init: ceph-osd
(ceph/23) main process ended, respawning
Aug 27 11:34:13 ceph-osd-03 kernel: [380340.159178] init: ceph-osd
(ceph/23) main process (1963) killed by ABRT signal
Aug 27 11:34:13 ceph-osd-03 kernel: [380340.159228] init: ceph-osd
(ceph/23) main process ended, respawning
Aug 27 11:34:24 ceph-osd-03 kernel: [380350.843268] init: ceph-osd
(ceph/23) main process (2478) killed by ABRT signal
Aug 27 11:34:24 ceph-osd-03 kernel: [380350.843282] init: ceph-osd
(ceph/23) main process ended, respawning

ceph-osd-03:ceph-osd.23.log

    -11> 2015-08-27 11:37:53.054359 7fef96fa4700 1 --
10.21.0.23:6802/10219 <== osd.10 10.21.0.22:6800/3623 21 ====
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[]) v11 ==== 1226+0+15250 (4145454591 0 2457813372) 
0x10229600 con
0xf885440
    -10> 2015-08-27 11:37:53.054407 7fef96fa4700 5 -- op 
tracker -- seq:
390, time: 2015-08-27 11:37:53.054223, event: header_read, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -9> 2015-08-27 11:37:53.054420 7fef96fa4700 5 -- op 
tracker -- seq:
390, time: 2015-08-27 11:37:53.054226, event: throttled, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -8> 2015-08-27 11:37:53.054427 7fef96fa4700 5 -- op 
tracker -- seq:
390, time: 2015-08-27 11:37:53.054348, event: all_read, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -7> 2015-08-27 11:37:53.054434 7fef96fa4700 5 -- op 
tracker -- seq:
390, time: 0.000000, event: dispatched, op: 
osd_sub_op(unknown.0.0:0 3.f9
0//0//-1 [scrub-map] v 0'0 snapset=0=[]:[] snapc=0=[])
     -6> 2015-08-27 11:37:53.054571 7fef9ecdb700 5 -- op 
tracker -- seq:
390, time: 2015-08-27 11:37:53.054570, event: reached_pg, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -5> 2015-08-27 11:37:53.054606 7fef9ecdb700 5 -- op 
tracker -- seq:
390, time: 2015-08-27 11:37:53.054606, event: started, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -4> 2015-08-27 11:37:53.054737 7fef9ecdb700 5 -- op 
tracker -- seq:
390, time: 2015-08-27 11:37:53.054737, event: done, op:
osd_sub_op(unknown.0.0:0 3.f9 0//0//-1 [scrub-map] v 0'0 
snapset=0=[]:[]
snapc=0=[])
     -3> 2015-08-27 11:37:53.054832 7fef9dcd9700 2 osd.23 
pg_epoch: 141062
pg[3.f9( v 141062'10774941 (137054'10771888,141062'10774941]
local-les=141062 n=11076 ec=101 les/c 141062/141062 
141061/141061/141061)
[23,10] r=0 lpr=141061 crt=141053'10774936 lcod 141062'10774940 
mlcod
141062'10774940 active+clean+scrubbing+deep] 
scrub_compare_maps   osd.23 has
25 items
     -2> 2015-08-27 11:37:53.054877 7fef9dcd9700 2 osd.23 
pg_epoch: 141062
pg[3.f9( v 141062'10774941 (137054'10771888,141062'10774941]
local-les=141062 n=11076 ec=101 les/c 141062/141062 
141061/141061/141061)
[23,10] r=0 lpr=141061 crt=141053'10774936 lcod 141062'10774940 
mlcod
141062'10774940 active+clean+scrubbing+deep] scrub_compare_maps 
replica 10
has 25 items
     -1> 2015-08-27 11:37:53.055207 7fef9dcd9700 2 osd.23 
pg_epoch: 141062
pg[3.f9( v 141062'10774941 (137054'10771888,141062'10774941]
local-les=141062 n=11076 ec=101 les/c 141062/141062 
141061/141061/141061)
[23,10] r=0 lpr=141061 crt=141053'10774936 lcod 141062'10774940 
mlcod
141062'10774940 active+clean+scrubbing+deep]
      0> 2015-08-27 11:37:53.060242 7fef9dcd9700 -1 
osd/osd_types.cc: In
function 'uint64_t SnapSet::get_clone_bytes(snapid_t) const' 
thread
7fef9dcd9700 time 2015-08-27 11:37:53.055278
osd/osd_types.cc: 4074: FAILED assert(clone_size.count(clone))

  ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x8b) [0xbc2b8b]
  2: (SnapSet::get_clone_bytes(snapid_t) const+0xb6) [0x79da36]
  3: (ReplicatedPG::_scrub(ScrubMap&, std::map<hobject_t,
std::pair<unsigned int, unsigned int>, std::less<hobject_t>,
std::allocator<std::pair<hobject_t const, std::pair<unsigned 
int, unsigned
int> > > > const&)+0xa2a) [0x88491a]
  4: (PG::scrub_compare_maps()+0xd89) [0x7f3c39]
  5: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x1ee) [0x7f6d9e]
  6: (PG::scrub(ThreadPool::TPHandle&)+0x2ee) [0x7f870e]
  7: (OSD::ScrubWQ::_process(PG*, ThreadPool::TPHandle&)+0x19) 
[0x6cca59]
  8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) 
[0xbb3d1e]
  9: (ThreadPool::WorkThread::entry()+0x10) [0xbb4dc0]
  10: (()+0x8182) [0x7fefc0fb9182]
  11: (clone()+0x6d) [0x7fefbf52447d]
  NOTE: a copy of the executable, or `objdump -rdS 
<executable>` is needed
to interpret this.

root@ceph-admin:~# ceph -s
     cluster d960d672-e035-413d-ba39-8341f4131760
      health HEALTH_WARN
             3 pgs backfill
             2 pgs backfilling
             5 pgs stuck unclean
             2 requests are blocked > 32 sec
             recovery 363/10993480 objects degraded (0.003%)
             recovery 181602/10993480 objects misplaced (1.652%)
             pool libvirt-pool pg_num 512 > pgp_num 412
      monmap e1: 3 mons at
{ceph-mon1=10.20.0.11:6789/0,ceph-mon2=10.20.0.12:6789/0,ceph-mon3=10.20.0.13:6789/0} 

             election epoch 4780, quorum 0,1,2
ceph-mon1,ceph-mon2,ceph-mon3
      osdmap e141113: 46 osds: 42 up, 42 in; 5 remapped pgs
       pgmap v17289866: 1600 pgs, 4 pools, 20158 GB data, 5316 
kobjects
             42081 GB used, 32394 GB / 74476 GB avail
             363/10993480 objects degraded (0.003%)
             181602/10993480 objects misplaced (1.652%)
                 1595 active+clean
                    3 active+remapped+wait_backfill
                    2 active+remapped+backfilling
recovery io 83489 kB/s, 20 objects/s
   client io 13079 kB/s rd, 24569 kB/s wr, 113 op/s
root@ceph-admin:~#

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com