Hi,

Thank you for providing me this level of detail. I ended up just failing the drive, since it is still under support and we had in fact gotten emails about the health of this drive in the past. I will, however, use this approach in the future if we have an issue with a PG and it is the first time we have had an issue with the drive, and/or the drive is no longer under support.

Thanks again.

Shain

> On Mar 19, 2017, at 11:19 AM, Mehmet <ceph@xxxxxxxxxx> wrote:
>
> Hi Shain,
>
> What I would do: take osd.32 out.
>
> # systemctl stop ceph-osd@32
> # ceph osd out osd.32
>
> This will cause rebalancing.
>
> To repair/reuse the drive you can do:
>
> # smartctl -t long /dev/sdX
>
> This will start a long self-test on the drive and - I bet - it will abort after a while with something like:
>
> # smartctl -a /dev/sdX
> [...]
> SMART Self-test log
> Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
>      Description                              number   (hours)
> # 1  Background long   Failed in segment -->        -      4378       35494670 [0x3 0x11 0x0]
> [...]
>
> Now mark the segment as defective - my system was Ubuntu:
>
> # apt install sg3-utils/xenial
> # sg_verify --lba=35494670 /dev/sdX1
> # sg_reassign --address=35494670 /dev/sdX
> # sg_reassign --grown /dev/sdX
>
> The next long test should hopefully work fine:
>
> # smartctl -t long /dev/sdX
>
> If not, repeat the above with the newly found defective LBA.
>
> I've done this three times successfully - but not with an error on a primary PG.
>
> After that you can start the OSD with:
>
> # systemctl start ceph-osd@32
> # ceph osd in osd.32
>
> HTH
> - Mehmet
>
> On 2017-03-17 20:08, Shain Miley wrote:
>> Brian,
>>
>> Thank you for the detailed information. I was able to compare the 3 hexdump files and it looks like the primary PG is the odd man out.
>>
>> I stopped the OSD and then attempted to move the object:
>>
>> root@hqosd3:/var/lib/ceph/osd/ceph-32/current/3.2b8_head/DIR_8/DIR_B/DIR_2/DIR_A/DIR_0# mv rb.0.fe307e.238e1f29.00000076024c__head_4650A2B8__3 /root
>> mv: error reading ‘rb.0.fe307e.238e1f29.00000076024c__head_4650A2B8__3’: Input/output error
>> mv: failed to extend ‘/root/rb.0.fe307e.238e1f29.00000076024c__head_4650A2B8__3’: Input/output error
>>
>> However, I got a nice Input/output error instead. I assume that this is not what normally happens.
>>
>> Any ideas on how I should proceed at this point? Should I fail out this OSD and replace the drive (I have had no indication, other than the I/O error, that there is an issue with this disk), or is there something I can try first?
>>
>> Thanks again,
>> Shain
>>
>>> On 03/17/2017 11:38 AM, Brian Andrus wrote:
>>> We went through a period of time where we were experiencing these daily...
>>>
>>> cd to the PG directory on each OSD and do a find for "238e1f29.00000076024c" (mentioned in your error message). This will likely return a file that has a slash in the name, something like rbd\udata.238e1f29.00000076024c__head_blah_1f...
>>>
>>> hexdump -C the object (tab-completing the name helps) and pipe the output to a different location. Once you obtain the hexdumps, do a diff or cmp against them and find which one is not like the others.
>>>
>>> If the primary is not the outlier, perform the PG repair without worry. If the primary is the outlier, you will need to stop the OSD, move the object out of place, start it back up, and then it will be okay to issue a PG repair.
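For anyone finding this thread later, here is a minimal sketch of the comparison Brian describes. The host names hqosd1 and hqosd2, and the use of md5sum rather than a full hexdump/diff, are assumptions - adjust the hosts, PG id, and paths to your own cluster:

    #!/bin/sh
    # Print a checksum for each replica of the suspect object; the copy
    # whose checksum differs from the other two is the bad one.
    OBJ="238e1f29.00000076024c"           # object id from the scrub error
    PG="3.2b8"                            # the inconsistent PG
    for host in hqosd1 hqosd2 hqosd3; do  # hosts in the acting set (assumed names)
        echo "== $host =="
        ssh "$host" "find /var/lib/ceph/osd/ceph-*/current/${PG}_head -name '*${OBJ}*' -exec md5sum {} \;"
    done
    # If the odd checksum is NOT on the primary OSD, 'ceph pg repair 3.2b8' is
    # safe. If it IS on the primary, stop that OSD, move the object aside,
    # start the OSD again, and then issue the repair.

Note that a checksum only tells you which copy differs, not where; the hexdump/diff approach shows the differing bytes. And if one copy simply returns a read error (as the copy on osd.32 did here), that by itself identifies the bad copy.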
>>> Other less common inconsistent PGs we see are differing object sizes (easy to detect with a simple listing of file sizes) and differing attributes ("attr -l", but the error logs are usually precise in identifying the problematic PG copy).
>>>
>>>> On Fri, Mar 17, 2017 at 8:16 AM, Shain Miley <smiley@xxxxxxx> wrote:
>>>> Hello,
>>>>
>>>> Ceph status is showing:
>>>>
>>>> 1 pgs inconsistent
>>>> 1 scrub errors
>>>> 1 active+clean+inconsistent
>>>>
>>>> I located the error messages in the logfile after querying the PG in question:
>>>>
>>>> root@hqosd3:/var/log/ceph# zgrep -Hn 'ERR' ceph-osd.32.log.1.gz
>>>> ceph-osd.32.log.1.gz:846:2017-03-17 02:25:20.281608 7f7744d7f700 -1 log_channel(cluster) log [ERR] : 3.2b8 shard 32: soid 3/4650a2b8/rb.0.fe307e.238e1f29.00000076024c/head candidate had a read error, data_digest 0x84c33490 != known data_digest 0x974a24a7 from auth shard 62
>>>> ceph-osd.32.log.1.gz:847:2017-03-17 02:30:40.264219 7f7744d7f700 -1 log_channel(cluster) log [ERR] : 3.2b8 deep-scrub 0 missing, 1 inconsistent objects
>>>> ceph-osd.32.log.1.gz:848:2017-03-17 02:30:40.264307 7f7744d7f700 -1 log_channel(cluster) log [ERR] : 3.2b8 deep-scrub 1 errors
>>>>
>>>> Is this a case where it would be safe to use 'ceph pg repair'? The documentation indicates there are times where running this command is less safe than others, and I would like to be sure before I do so.
>>>>
>>>> Thanks,
>>>> Shain
>>>>
>>>> --
>>>> NPR | Shain Miley | Manager of Infrastructure, Digital Media | smiley@xxxxxxx | 202.513.3649
>>>
>>> --
>>> Brian Andrus | Cloud Systems Engineer | DreamHost
>>> brian.andrus@xxxxxxxxxxxxx | www.dreamhost.com
>>
>> --
>> NPR | Shain Miley | Manager of Infrastructure, Digital Media | smiley@xxxxxxx | 202.513.3649

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
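As a closing note, assuming the cluster is running Jewel or later (an assumption; the release in use is not stated above), the inconsistent shard can also be identified without manual hexdumps. A rough sketch, using the PG id from the log above:

    # Re-run the deep scrub so the inconsistency details are fresh.
    ceph pg deep-scrub 3.2b8
    # Wait for the scrub to finish (watch 'ceph -w' or 'ceph pg 3.2b8 query'),
    # then list the recorded inconsistencies:
    rados list-inconsistent-obj 3.2b8 --format=json-pretty
    # The JSON output records the errors per shard (e.g. a read_error or a
    # data_digest mismatch). If only a non-primary shard is flagged,
    # 'ceph pg repair 3.2b8' is safe; if the primary is flagged, follow the
    # stop-OSD / move-the-object-aside procedure described earlier first.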