Re: ONE pg deep-scrub blocks cluster

Hello Guys,

Your help is really appreciated!

[...]
For the record, this ONLY happens with this PG and no others that share the same OSDs, right?

Yes, right.
[...]
When doing the deep-scrub, monitor (atop, etc.) all 3 nodes and see if a particular OSD (HDD) stands out, as I would expect it to.

Now I have logged all disks via atop every 2 seconds while the deep-scrub was running ( atop -w osdXX_atop 2 ).
As you expected, all disks were 100% busy, with a constant 150 MB (osd.4), 130 MB (osd.28) and 170 MB (osd.16)...

- osd.4 (/dev/sdf) http://slexy.org/view/s21emd2u6j [1]
- osd.16 (/dev/sdm): http://slexy.org/view/s20vukWz5E [2]
- osd.28 (/dev/sdh): http://slexy.org/view/s20YX0lzZY [3]
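In case you want to step through the raw atop files yourself, they should be replayable with something like this (the file name is just the osdXX_atop pattern from above; in replay mode 't' steps forward and 'T' backward in time, as far as I know):

# atop -r osdXX_atop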
[...]
But what is causing this? A deep-scrub on all the other disks - same model and ordered at the same time - does not seem to have this issue.
[...]
Next week, I will do this:

1.1 Remove osd.4 completely from Ceph - again (the current primary for PG 0.223)

osd.4 is now completely removed.
The primary for this PG is now osd.9:

# ceph pg map 0.223
osdmap e8671 pg 0.223 (0.223) -> up [9,16,28] acting [9,16,28]

1.2 xfs_repair -n /dev/sdf1 (osd.4): to check for possible errors

xfs_repair did not report any errors.
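For completeness, the read-only check was done roughly like this on the removed OSD (the unit name and mount point are the Jewel defaults on my hosts, so adjust if yours differ; -n means "no modify", it only reports problems):

# systemctl stop ceph-osd@4
# umount /var/lib/ceph/osd/ceph-4
# xfs_repair -n /dev/sdf1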

1.3 ceph pg deep-scrub 0.223
- Log with: ceph tell osd.4,16,28 injectargs "--debug_osd 5/5"

Since osd.9 is now the primary for this PG, I have set debug_osd on it too:
ceph tell osd.9 injectargs "--debug_osd 5/5"

and ran the deep-scrub on 0.223 (and again nearly all of my VMs stopped working for a while).
Start @ 15:33:27
End @ 15:48:31

The "ceph.log"
- http://slexy.org/view/s2WbdApDLz
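If you want to see the client impact in there, this is the kind of check I do during/after the scrub (the log path is the default on my mon host, and the pattern is just what Jewel prints for blocked ops, as far as I know):

# ceph health detail
# grep -i "slow request" /var/log/ceph/ceph.log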

The related log files (OSDs 9, 16 and 28) and the corresponding atop logs for the OSDs:

LogFile - osd.9 (/dev/sdk)
- ceph-osd.9.log: http://slexy.org/view/s2kXeLMQyw
- atop Log: http://slexy.org/view/s21wJG2qr8

LogFile - osd.16 (/dev/sdm)
- ceph-osd.16.log: http://slexy.org/view/s20D6WhD4d
- atop Log: http://slexy.org/view/s2iMjer8rC

LogFile - osd.28 (/dev/sdh)
- ceph-osd.28.log: http://slexy.org/view/s21dmXoEo7
- atop log: http://slexy.org/view/s2gJqzu3uG
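Now that the logs are collected, I will turn the debug level back down again, something like this (assuming 0/5 is the stock default for debug_osd on Jewel):

# ceph tell osd.9 injectargs "--debug_osd 0/5"
# ceph tell osd.16 injectargs "--debug_osd 0/5"
# ceph tell osd.28 injectargs "--debug_osd 0/5"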

2.1 Remove osd.16 completely from Ceph
2.2 xfs_repair -n /dev/sdm1
2.3 ceph pg deep-scrub 0.223
- Log with " ceph tell osd.4,16,28 injectargs "--debug_osd 5/5"

Tomorrow I will remove osd.16 in addition to osd.4 and do the same (the removal steps I use are sketched below).
The acting set for PG 0.223 will then be: 9, ?, 28
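For reference, "remove completely" means the usual sequence from the docs, roughly like this (same as I did for osd.4; I wait for the rebalance to finish after marking it out):

# ceph osd out 16
  ... wait until the cluster is back to HEALTH_OK ...
# systemctl stop ceph-osd@16
# ceph osd crush remove osd.16
# ceph auth del osd.16
# ceph osd rm 16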

3.1 Remove osd.28 completely from Ceph
3.2 xfs_repair -n /dev/sdh1
3.3 ceph pg deep-scrub 0.223
- Log with " ceph tell osd.4,16,28 injectargs "--debug_osd 5/5"

After that, osd.28 will follow.
The acting set for PG 0.223 will then be: 9, ?, ?

If my VMs still stall for a while even though the previously mentioned disks (4, 16, 28) are no longer in the cluster, then there must be an issue with this PG!
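In that case my plan is to compare this PG with a healthy one on the same OSDs, e.g. via (the grep fields are just the ones I see in the JSON output, so take this as a sketch):

# ceph pg 0.223 query | grep -E "num_objects|last_deep_scrub"
# ceph pg <healthy pg on same osds> query | grep -E "num_objects|last_deep_scrub"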

smartctl may not show anything out of sorts until the marginally bad sector or sectors finally go bad and get remapped. The only hint may be buried in the raw read error rate, seek error rate or other error counts like ECC or CRC errors. The long test you are running may or may not show any new information.

The long smartctl checks did not find any issues.
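For reference, the checks were along these lines (device name as in the atop logs above; the attribute names differ a bit between vendors):

# smartctl -t long /dev/sdf
# smartctl -l selftest /dev/sdf    (result once the test has finished)
# smartctl -A /dev/sdf | egrep -i "raw_read|seek_err|crc|pending|realloc"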

Perhaps it is notable that I have had the tunables set to "jewel" since installation. The "sortbitwise" flag is also set, since this is the default for a Jewel installation.
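If it matters, this is how I verified both settings here (plain status commands, nothing special):

# ceph osd crush show-tunables
# ceph osd dump | grep flags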

My next mail will follow tomorrow.

Did you guys find anything in the attached logs that I did not see?

- Mehmet

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


