Ah, this is kind of silly. I think you don't have 37 errors, but 2 errors.
pg 2.490 object 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2
is missing snap 141. If you look at the objects after that in the log:

2015-08-20 20:15:44.865670 osd.19 10.12.2.6:6838/1861727 298 : cluster [ERR] repair 2.490 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/head//2 expected clone 2d7b9490/rbd_data.18f92c3d1b58ba.0000000000006167/141//2
2015-08-20 20:15:44.865817 osd.19 10.12.2.6:6838/1861727 299 : cluster [ERR] repair 2.490 ded49490/rbd_data.11a25c7934d3d4.0000000000008a8a/head//2 expected clone 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/141//2

The clone from the second line matches the head object from the previous
line, and they have the same clone id. I *think* that the first error is
real, and the subsequent ones are just scrub being dumb. Same deal with
pg 2.c4. I just opened http://tracker.ceph.com/issues/12738.

The original problem is that
3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2 and
22ca30c4/rbd_data.e846e25a70bf7.0000000000000307/snapdir//2 are both
missing a clone. Not sure how that happened; my money is on a
cache/tiering evict racing with a snap trim. If you have any logging or
relevant information from when that happened, you should open a bug. The
'snapdir' in the two object names indicates that the head object has
actually been deleted (which makes sense if you moved the image to a new
image and deleted the old one) and is only being kept around since there
are live snapshots.

I suggest you leave the snapshots for those images alone for the time
being -- removing them might cause the osd to crash trying to clean up
the weird on-disk state. Other than the leaked space from those two image
snapshots and the annoying spurious scrub errors, I think no actual
corruption is going on, though. I created a tracker ticket for a feature
that would let ceph-objectstore-tool remove the spurious clone from the
head/snapdir metadata.

Am I right that you haven't actually seen any osd crashes or user-visible
corruption (except possibly on snapshots of those two images)?
-Sam
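
To make the pattern Sam describes easier to spot -- each spurious error's
"expected clone" is simply the head object from the error line directly
above it -- something like the following rough, untested sketch can walk
the [ERR] lines for one pg and flag only the entries that do not chain off
the previous head; those point at the genuinely missing clones. The log
path, the pg id, and the assumed field layout ("<head> expected clone
<clone>") are placeholders based on the lines quoted above, not a
supported tool:

# Rough sketch only: list scrub/repair errors whose "expected clone" does
# NOT match the head object of the previous error line for the same pg.
pg="2.490"                       # run again with pg="2.c4"
log="/var/log/ceph/ceph.log"     # adjust to wherever the cluster log lives

grep "expected clone" "$log" | grep -F " $pg " | awk '
{
    # assumed layout: ... <head-object> expected clone <clone-object>
    for (i = 1; i <= NF; i++)
        if ($i == "expected" && $(i + 1) == "clone") {
            head = $(i - 1); clone = $(i + 2)
        }
    split(head, h, "/"); split(clone, c, "/")
    head_key  = h[1] "/" h[2]    # <hash>/<rbd_data...>, snap id stripped
    clone_key = c[1] "/" c[2]
    if (clone_key != prev_head)
        print "does not chain off previous head (likely real):", clone
    prev_head = head_key
}'

Run over the full cluster log, this should leave roughly one flagged entry
per pg, which would match Sam's reading that a single missing clone in
each of 2.490 and 2.c4 is fanning out into dozens of spurious scrub
errors.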

On Thu, Aug 20, 2015 at 10:07 AM, Voloshanenko Igor <igor.voloshanenko@xxxxxxxxx> wrote:
> Inktank:
> https://download.inktank.com/docs/ICE%201.2%20-%20Cache%20and%20Erasure%20Coding%20FAQ.pdf
>
> Mailing list:
> https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg18338.html
>
> 2015-08-20 20:06 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>>
>> Which docs?
>> -Sam
>>
>> On Thu, Aug 20, 2015 at 9:57 AM, Voloshanenko Igor <igor.voloshanenko@xxxxxxxxx> wrote:
>> > Not yet. I will create one.
>> > But according to the mailing lists and Inktank docs, it's expected
>> > behaviour when the cache is enabled.
>> >
>> > 2015-08-20 19:56 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >>
>> >> Is there a bug for this in the tracker?
>> >> -Sam
>> >>
>> >> On Thu, Aug 20, 2015 at 9:54 AM, Voloshanenko Igor <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> > The issue is that in forward mode fstrim doesn't work properly, and
>> >> > when we take a snapshot the data is not properly updated in the
>> >> > cache layer, so the client (ceph) sees a damaged snap, as the
>> >> > headers are requested from the cache layer.
>> >> >
>> >> > 2015-08-20 19:53 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >> >>
>> >> >> What was the issue?
>> >> >> -Sam
>> >> >>
>> >> >> On Thu, Aug 20, 2015 at 9:41 AM, Voloshanenko Igor <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> >> > Samuel, we turned off the cache layer a few hours ago...
>> >> >> > I will post ceph.log in a few minutes.
>> >> >> >
>> >> >> > For the snap issue - we found it was connected with the cache tier..
>> >> >> >
>> >> >> > 2015-08-20 19:23 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >> >> >>
>> >> >> >> Ok, you appear to be using a replicated cache tier in front of a
>> >> >> >> replicated base tier. Please scrub both inconsistent pgs and post
>> >> >> >> the ceph.log from before when you started the scrub until after.
>> >> >> >> Also, what command are you using to take snapshots?
>> >> >> >> -Sam
>> >> >> >>
>> >> >> >> On Thu, Aug 20, 2015 at 3:59 AM, Voloshanenko Igor <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> >> >> > Hi Samuel, we tried to fix it in a tricky way.
>> >> >> >> >
>> >> >> >> > We checked all the affected rbd_data chunks from the OSD logs,
>> >> >> >> > then queried rbd info to find which rbd contains the bad
>> >> >> >> > rbd_data; after that we mounted this rbd as rbd0, created an
>> >> >> >> > empty rbd, and dd'd everything from the bad volume to the new
>> >> >> >> > one.
>> >> >> >> >
>> >> >> >> > But after that the scrub errors kept growing... It was 15
>> >> >> >> > errors... now 35... We also tried to out the OSD which was the
>> >> >> >> > lead, but after rebalancing these 2 pgs still have 35 scrub
>> >> >> >> > errors...
>> >> >> >> >
>> >> >> >> > ceph osd getmap -o <outfile> - attached
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > 2015-08-18 18:48 GMT+03:00 Samuel Just <sjust@xxxxxxxxxx>:
>> >> >> >> >>
>> >> >> >> >> Is the number of inconsistent objects growing? Can you attach
>> >> >> >> >> the whole ceph.log from the 6 hours before and after the
>> >> >> >> >> snippet you linked above? Are you using cache/tiering? Can you
>> >> >> >> >> attach the osdmap (ceph osd getmap -o <outfile>)?
>> >> >> >> >> -Sam
>> >> >> >> >>
>> >> >> >> >> On Tue, Aug 18, 2015 at 4:15 AM, Voloshanenko Igor <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> >> >> >> > ceph - 0.94.2
>> >> >> >> >> > It happened during rebalancing.
>> >> >> >> >> >
>> >> >> >> >> > I thought too that some OSD was missing a copy, but it looks
>> >> >> >> >> > like they all are... So, any advice on which direction I
>> >> >> >> >> > need to go?
>> >> >> >> >> >
>> >> >> >> >> > 2015-08-18 14:14 GMT+03:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
>> >> >> >> >> >>
>> >> >> >> >> >> From a quick peek it looks like some of the OSDs are missing
>> >> >> >> >> >> clones of objects. I'm not sure how that could happen and
>> >> >> >> >> >> I'd expect the pg repair to handle that but if it's not
>> >> >> >> >> >> there's probably something wrong; what version of Ceph are
>> >> >> >> >> >> you running? Sam, is this something you've seen, a new bug,
>> >> >> >> >> >> or some kind of config issue?
>> >> >> >> >> >> -Greg
>> >> >> >> >> >>
>> >> >> >> >> >> On Tue, Aug 18, 2015 at 6:27 AM, Voloshanenko Igor <igor.voloshanenko@xxxxxxxxx> wrote:
>> >> >> >> >> >> > Hi all, at our production cluster, due to high rebalancing
>> >> >> >> >> >> > ((( we have 2 pgs in an inconsistent state...
>> >> >> >> >> >> >
>> >> >> >> >> >> > root@temp:~# ceph health detail | grep inc
>> >> >> >> >> >> > HEALTH_ERR 2 pgs inconsistent; 18 scrub errors
>> >> >> >> >> >> > pg 2.490 is active+clean+inconsistent, acting [56,15,29]
>> >> >> >> >> >> > pg 2.c4 is active+clean+inconsistent, acting [56,10,42]
>> >> >> >> >> >> >
>> >> >> >> >> >> > From the OSD logs, after a recovery attempt:
>> >> >> >> >> >> >
>> >> >> >> >> >> > root@test:~# ceph pg dump | grep -i incons | cut -f 1 | while read i; do ceph pg repair ${i} ; done
>> >> >> >> >> >> > dumped all in format plain
>> >> >> >> >> >> > instructing pg 2.490 on osd.56 to repair
>> >> >> >> >> >> > instructing pg 2.c4 on osd.56 to repair
>> >> >> >> >> >> >
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:51:2015-08-18 07:26:37.035910 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 f5759490/rbd_data.1631755377d7e.00000000000004da/head//2 expected clone 90c59490/rbd_data.eb486436f2beb.0000000000007a65/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:52:2015-08-18 07:26:37.035960 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 fee49490/rbd_data.12483d3ba0794b.000000000000522f/head//2 expected clone f5759490/rbd_data.1631755377d7e.00000000000004da/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:53:2015-08-18 07:26:37.036133 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 a9b39490/rbd_data.12483d3ba0794b.00000000000037b3/head//2 expected clone fee49490/rbd_data.12483d3ba0794b.000000000000522f/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:54:2015-08-18 07:26:37.036243 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 bac19490/rbd_data.1238e82ae8944a.000000000000032e/head//2 expected clone a9b39490/rbd_data.12483d3ba0794b.00000000000037b3/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:55:2015-08-18 07:26:37.036289 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 98519490/rbd_data.123e9c2ae8944a.0000000000000807/head//2 expected clone bac19490/rbd_data.1238e82ae8944a.000000000000032e/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:56:2015-08-18 07:26:37.036314 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 c3c09490/rbd_data.1238e82ae8944a.0000000000000c2b/head//2 expected clone 98519490/rbd_data.123e9c2ae8944a.0000000000000807/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:57:2015-08-18 07:26:37.036363 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 28809490/rbd_data.edea7460fe42b.00000000000001d9/head//2 expected clone c3c09490/rbd_data.1238e82ae8944a.0000000000000c2b/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:58:2015-08-18 07:26:37.036432 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 e1509490/rbd_data.1423897545e146.00000000000009a6/head//2 expected clone 28809490/rbd_data.edea7460fe42b.00000000000001d9/141//2
>> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:59:2015-08-18 07:26:38.548765 7f94663b3700 -1 log_channel(cluster) log [ERR] : 2.490 deep-scrub 17 errors
>> >> >> >> >> >> >
>> >> >> >> >> >> > So, how can I solve the "expected clone" situation by hand?
>> >> >> >> >> >> > Thanks in advance!
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
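
As a footnote to the procedure Igor describes in the quoted thread
(matching the affected rbd_data chunks from the logs back to images via
rbd info), that lookup can be scripted roughly along these lines. This is
an untested sketch: the pool name "rbd" is a placeholder for the real base
pool, and the two prefixes are taken from the snapdir objects Sam
identified above.

# Rough sketch only: map rbd_data.<prefix> names from the [ERR] lines back
# to the RBD images that own them, via the block_name_prefix in 'rbd info'.
pool="rbd"                                  # placeholder -- use the real base pool
prefixes="eb5f22eb141f2 e846e25a70bf7"      # prefixes from the two snapdir objects

for img in $(rbd -p "$pool" ls); do
    bnp=$(rbd -p "$pool" info "$img" | awk '/block_name_prefix/ {print $2}')
    for p in $prefixes; do
        if [ "$bnp" = "rbd_data.$p" ]; then
            echo "rbd_data.$p belongs to image: $pool/$img"
        fi
    done
done

Once the owning images are known, 'rbd snap ls <pool>/<image>' shows which
snapshots are still pinning those snapdir objects; per Sam's advice above,
they are probably best left in place until the spurious clone metadata can
be cleaned up.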