Re: Ceph pg repair clone_missing?

On Thu, Oct 3, 2019 at 6:46 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:
>
>  >
>  >>
>  >> I was following the thread where you advised on this pg repair.
>  >>
>  >> I ran 'rados list-inconsistent-obj' / 'rados
>  >> list-inconsistent-snapset' and have output on the snapset. I tried
>  >> to extrapolate your comment on the data/omap_digest_mismatch_info
>  >> onto my situation, but I don't know how to proceed. I got the
>  >> advice on this mailing list to delete snapshot 4, but seeing this
>  >> output, that might not have been the smartest thing to do.
>  >
>  >That remains to be seen. Can you post the actual scrub error you
>  >are getting?
>
> 2019-10-03 09:27:07.831046 7fc448bf6700 -1 log_channel(cluster) log
> [ERR] : deep-scrub 17.36
> 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:head : expected
> clone 17:6ca1f70a:::rbd_data.1f114174b0dc51.0000000000000974:4 1 missing

Try something like the following on each OSD that holds a copy of
rbd_data.1f114174b0dc51.0000000000000974 and see what output you get.
Note that you can drop the bluestore flag if they are not BlueStore
OSDs, and the OSD needs to be stopped at the time (set noout first).
Also note that snapids are displayed in hexadecimal in the output
(but '4' is '4' either way, so not a big issue here).

$ ceph-objectstore-tool --type bluestore \
      --data-path /var/lib/ceph/osd/ceph-XX/ \
      --pgid 17.36 --op list \
      rbd_data.1f114174b0dc51.0000000000000974
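
For orientation, the list op prints one JSON entry per object version
that OSD holds for the PG, so if a stray clone 4 is still present on a
replica you would expect an entry with snapid 4 in addition to the head
entry. A rough sketch of the shape (the hash and exact field values
below are illustrative, not taken from your cluster):

["17.36",{"oid":"rbd_data.1f114174b0dc51.0000000000000974","key":"","snapid":4,"hash":1234567890,"max":0,"pool":17,"namespace":""}]

A copy that agrees with the primary (snapshot 4 really gone) would show
no snapid 4 entry for this object.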

The likely issue here is that the primary believes snapshot 4 is gone
but there is still data and/or metadata for it on one of the replicas,
which is confusing matters. If that is the case you can use
ceph-objectstore-tool to delete the relevant snapshot object(s).
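
As a rough sketch (assuming the stray clone turns up on one of the
replicas), you would feed the exact JSON entry that the list op printed
for the snapid 4 object back to the tool with the 'remove' op, again
with that OSD stopped. The JSON below is only a placeholder for
whatever your list output shows:

$ ceph-objectstore-tool --type bluestore \
      --data-path /var/lib/ceph/osd/ceph-XX/ --pgid 17.36 \
      '["17.36",{"oid":"rbd_data.1f114174b0dc51.0000000000000974",...,"snapid":4,...}]' \
      remove

Then restart the OSD, unset noout, and re-run the deep-scrub / pg
repair on 17.36 to confirm the inconsistency clears.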

>  >>
>  >>
>  >>
>  >>
>  >> [0]
>  >> http://tracker.ceph.com/issues/24994
>  >
>  >At first glance this appears to be a different issue to yours.
>  >
>  >>
>  >> [1]
>  >> {
>  >>   "epoch": 66082,
>  >>   "inconsistents": [
>  >>     {
>  >>       "name": "rbd_data.1f114174b0dc51.0000000000000974",
>  >
>  >rbd_data.1f114174b0dc51 is the block_name_prefix for this image. You
>  >can run 'rbd info' on the images in this pool to see which image is
>  >actually affected and how important the data is.
>
> Yes I know what image it is. Deleting data is easy, I would like to
> know/learn how to fix this.

I wasn't suggesting you just delete it. I merely suggested you be
informed about what data you are manipulating so you can proceed
appropriately.
>
>  >
>  >>       "nspace": "",
>  >>       "locator": "",
>  >>       "snap": "head",
>  >>       "snapset": {
>  >>         "snap_context": {
>  >>           "seq": 63,
>  >>           "snaps": [
>  >>             63,
>  >>             35,
>  >>             13,
>  >>             4
>  >>           ]
>  >>         },
>  >>         "head_exists": 1,
>  >>         "clones": [
>  >>           {
>  >>             "snap": 4,
>  >>             "size": 4194304,
>  >>             "overlap": "[]",
>  >>             "snaps": [
>  >>               4
>  >>             ]
>  >>           },
>  >>           {
>  >>             "snap": 63,
>  >>             "size": 4194304,
>  >>             "overlap": "[0~4194304]",
>  >>             "snaps": [
>  >>               63,
>  >>               35,
>  >>               13
>  >>             ]
>  >>           }
>  >>         ]
>  >>       },
>  >>       "errors": [
>  >>         "clone_missing"
>  >>       ],
>  >>       "missing": [
>  >>         4
>  >>       ]
>  >>     }
>  >>   ]
>  >> }
>  >
>  >



-- 
Cheers,
Brad

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


