Re: inconsistent PG -> unfound objects on an erasure coded system

Oh, it's getting a stat mismatch.  I think what happened is that on
one of the earlier repairs it reset the stats to the wrong value (the
orphan was causing the primary to scan two objects twice, which
matches the stat mismatch I see here).  A pg repair will clear
that up.
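
To make the mismatch concrete, here is a minimal sketch that pulls the
counter pairs out of one of the deep-scrub error lines quoted below and
reports the ones that differ. The function name and regex are illustrative
only. It assumes each "A/B" pair is the scrub's own count followed by the
recorded PG stat; the thread does not spell that ordering out.

```python
import re

# One of the deep-scrub error lines from the log, joined onto a single line.
LOG_LINE = (
    "70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones, "
    "22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, "
    "68440088914/68445454633 bytes,0/0 hit_set_archive bytes."
)

# Matches "<first>/<second> <counter name>", including "hit_set_archive bytes".
PAIR = re.compile(r"(\d+)/(\d+) ([a-z_]+(?: bytes)?)")


def stat_mismatches(line):
    """Return {counter: second - first} for every pair whose values differ.

    ASSUMPTION: first = value counted by the scrub, second = recorded stat.
    """
    diffs = {}
    for first, second, name in PAIR.findall(line):
        if first != second:
            diffs[name] = int(second) - int(first)
    return diffs


print(stat_mismatches(LOG_LINE))
# {'objects': 2, 'dirty': 2, 'bytes': 5365719}
```

The recorded stats are ahead by exactly two objects, which is consistent
with two objects having been counted twice when the stats were last reset.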
-Sam

On Thu, Mar 17, 2016 at 9:22 AM, Jeffrey McDonald <jmcdonal@xxxxxxx> wrote:
> Thanks Sam.....
>
> Since I have prepared a script for this, I decided to go ahead with the
> checks.....(patience isn't one of my extended attributes....)
>
> I've got a file that searches the full erasure encoded spaces and does your
> checklist below.   I have operated only on one PG so far, the 70.459 one
> that we've been discussing.    There was only the one file that I found to
> be out of place--the one we already discussed/found and it has been removed.
>
> The pg is still marked as inconsistent.   I've scrubbed it a couple of times
> now and what I've seen is:
>
> 2016-03-17 09:29:53.202818 7f2e816f8700  0 log_channel(cluster) log [INF] :
> 70.459 deep-scrub starts
> 2016-03-17 09:36:38.436821 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
> 70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones,
> 22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts,
> 68440088914/68445454633 bytes,0/0 hit_set_archive bytes.
> 2016-03-17 09:36:38.436844 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
> 70.459 deep-scrub 1 errors
> 2016-03-17 09:44:23.592302 7f2e816f8700  0 log_channel(cluster) log [INF] :
> 70.459 deep-scrub starts
> 2016-03-17 09:47:01.237846 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
> 70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones,
> 22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts,
> 68440088914/68445454633 bytes,0/0 hit_set_archive bytes.
> 2016-03-17 09:47:01.237880 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
> 70.459 deep-scrub 1 errors
>
>
> Should the scrub be sufficient to remove the inconsistent flag?   I took the
> osd offline during the repairs.    I've looked at files in all of the osds
> in the placement group and I'm not finding any more problem files.    The
> vast majority of files do not have the user.cephos.lfn3 attribute.    There
> are 22321 objects that I have seen and only about 230 have the user.cephos.lfn3
> file attribute.   The files will have other attributes, just not
> user.cephos.lfn3.
>
> Regards,
> Jeff
>
>
> On Wed, Mar 16, 2016 at 3:53 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>
>> Ok, like I said, most files with _long at the end are *not orphaned*.
>> The generation number also is *not* an indication of whether the file
>> is orphaned -- some of the orphaned files will have ffffffffffffffff
>> as the generation number and others won't.  For each long filename
>> object in a pg you would have to:
>> 1) Pull the long name out of the attr
>> 2) Parse the hash out of the long name
>> 3) Turn that into a directory path
>> 4) Determine whether the file is at the right place in the path
>> 5) If not, remove it (or echo it to be checked)
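
The checklist above can be sketched roughly as follows. This is not the
ceph-objectstore-tool implementation, and every helper name is hypothetical.
It rests on two assumptions that the thread does not confirm: that the long
name carries the object hash as an 8-digit uppercase hex field right after
the snapshot field (for example "..._head_79CED459_..."), and that the
DIR_x components of the on-disk path are the hash's hex digits in reverse
order. It only reports whether a file looks misplaced; it removes nothing.

```python
import os
import re
from pathlib import PurePosixPath

LFN_ATTR = "user.cephos.lfn3"  # attribute name as mentioned in this thread

# ASSUMPTION: the hash is the 8 uppercase hex digits after the snapshot field.
HASH_FIELD = re.compile(r"_(?:head|snapdir|[0-9a-f]+)_([0-9A-F]{8})_")


def read_long_name(path):
    """Step 1: pull the long name out of the attr (Linux only, not run here)."""
    raw = os.getxattr(path, LFN_ATTR)
    return raw.rstrip(b"\0").decode("utf-8", errors="replace")


def parse_hash(long_name):
    """Step 2: parse the hash out of the long name, or return None."""
    matches = HASH_FIELD.findall(long_name)
    return matches[-1] if matches else None


def expected_dirs(hash_hex, depth):
    """Step 3: turn the hash into DIR_x components (ASSUMED reversed digits)."""
    return ["DIR_" + digit for digit in reversed(hash_hex.upper())][:depth]


def is_in_expected_dir(directory, hash_hex):
    """Step 4: do the DIR_x parts of `directory` match the hash prefix?"""
    actual = [p for p in PurePosixPath(directory).parts if p.startswith("DIR_")]
    return actual == expected_dirs(hash_hex, len(actual))


# Hypothetical long name and directory, made up for illustration only.
name = "example_object__head_79CED459__46_ffffffffffffffff_0"
where = "/osd/current/70.459s0_head/DIR_9/DIR_5/DIR_4/DIR_D"
h = parse_hash(name)
# Step 5: echo the candidate instead of removing it.
print(h, "ok" if is_in_expected_dir(where, h) else "CHECK: possibly misplaced")
```

Echoing candidates rather than deleting them keeps the check safe to run
while the OSD is offline, with any removal left as a separate manual step.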
>>
>> You probably want to wait for someone to get around to writing a
>> branch for ceph-objectstore-tool.  Should happen in the next week or
>> two.
>> -Sam
>>
>
> --
>
> Jeffrey McDonald, PhD
> Assistant Director for HPC Operations
> Minnesota Supercomputing Institute
> University of Minnesota Twin Cities
> 599 Walter Library           email: jeffrey.mcdonald@xxxxxxxxxxx
> 117 Pleasant St SE           phone: +1 612 625-6905
> Minneapolis, MN 55455        fax:   +1 612 624-8861
>
>