Oh, it's getting a stat mismatch. I think what happened is that on
one of the earlier repairs it reset the stats to the wrong value (the
orphan was causing the primary to scan two objects twice, which
matches the stat mismatch I see here). A pg repair will clear that up.
-Sam

On Thu, Mar 17, 2016 at 9:22 AM, Jeffrey McDonald <jmcdonal@xxxxxxx> wrote:
> Thanks Sam.....
>
> Since I have prepared a script for this, I decided to go ahead with the
> checks.....(patience isn't one of my extended attributes....)
>
> I've got a file that searches the full erasure encoded spaces and does your
> checklist below. I have operated only on one PG so far, the 70.459 one
> that we've been discussing. There was only the one file that I found to
> be out of place--the one we already discussed/found and it has been removed.
>
> The pg is still marked as inconsistent. I've scrubbed it a couple of times
> now and what I've seen is:
>
> 2016-03-17 09:29:53.202818 7f2e816f8700 0 log_channel(cluster) log [INF] : 70.459 deep-scrub starts
> 2016-03-17 09:36:38.436821 7f2e816f8700 -1 log_channel(cluster) log [ERR] : 70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones, 22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 68440088914/68445454633 bytes,0/0 hit_set_archive bytes.
> 2016-03-17 09:36:38.436844 7f2e816f8700 -1 log_channel(cluster) log [ERR] : 70.459 deep-scrub 1 errors
> 2016-03-17 09:44:23.592302 7f2e816f8700 0 log_channel(cluster) log [INF] : 70.459 deep-scrub starts
> 2016-03-17 09:47:01.237846 7f2e816f8700 -1 log_channel(cluster) log [ERR] : 70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones, 22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, 68440088914/68445454633 bytes,0/0 hit_set_archive bytes.
> 2016-03-17 09:47:01.237880 7f2e816f8700 -1 log_channel(cluster) log [ERR] : 70.459 deep-scrub 1 errors
>
> Should the scrub be sufficient to remove the inconsistent flag? I took the
> osd offline during the repairs. I've looked at files in all of the osds
> in the placement group and I'm not finding any more problem files. The
> vast majority of files do not have the user.cephos.lfn3 attribute. There
> are 22321 objects that I have seen and only about 230 have the
> user.cephos.lfn3 file attribute. The files will have other attributes,
> just not user.cephos.lfn3.
>
> Regards,
> Jeff
>
> On Wed, Mar 16, 2016 at 3:53 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>>
>> Ok, like I said, most files with _long at the end are *not orphaned*.
>> The generation number also is *not* an indication of whether the file
>> is orphaned -- some of the orphaned files will have ffffffffffffffff
>> as the generation number and others won't. For each long filename
>> object in a pg you would have to:
>> 1) Pull the long name out of the attr
>> 2) Parse the hash out of the long name
>> 3) Turn that into a directory path
>> 4) Determine whether the file is at the right place in the path
>> 5) If not, remove it (or echo it to be checked)
>>
>> You probably want to wait for someone to get around to writing a
>> branch for ceph-objectstore-tool. Should happen in the next week or
>> two.
>> -Sam
>>
>
> --
>
> Jeffrey McDonald, PhD
> Assistant Director for HPC Operations
> Minnesota Supercomputing Institute
> University of Minnesota Twin Cities
> 599 Walter Library           email: jeffrey.mcdonald@xxxxxxxxxxx
> 117 Pleasant St SE           phone: +1 612 625-6905
> Minneapolis, MN 55455        fax:   +1 612 624-8861
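
As a rough cross-check of the scrub output quoted above, the stat mismatch line can be broken into its per-counter pairs and the differences computed. The sketch below is only an illustration: the helper name `stat_deltas` and the regular expression are made up for this example and are not part of any Ceph tool. It assumes each pair reads "first/second name", and it does not rely on which side is the scrubbed count and which is the recorded PG stat.

```python
import re

# Matches pairs such as "22319/22321 objects" or "0/0 hit_set_archive bytes".
# The pattern is an assumption based on the log line quoted in this thread.
PAIR = re.compile(r"(\d+)/(\d+) ([a-z_]+(?: bytes)?)")


def stat_deltas(line):
    """Return {counter_name: second - first} for every pair that differs."""
    deltas = {}
    for first, second, name in PAIR.findall(line):
        diff = int(second) - int(first)
        if diff:
            deltas[name] = diff
    return deltas


# The mismatch line from the 70.459 deep-scrub, without the timestamp prefix.
log_line = (
    "70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones, "
    "22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts, "
    "68440088914/68445454633 bytes,0/0 hit_set_archive bytes."
)

print(stat_deltas(log_line))
```

For the line above this prints `{'objects': 2, 'dirty': 2, 'bytes': 5365719}`: the two sides differ by exactly two objects, which lines up with the explanation that two objects were scanned twice.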