> On 13 February 2017 at 16:03, Eugen Block <eblock@xxxxxx> wrote:
>
>
> Hi experts,
>
> I have a strange situation right now. We are re-organizing our 4-node
> Hammer cluster from LVM-based OSDs to HDDs. When we did this on the
> first node last week, everything went smoothly: I removed the OSDs
> from the crush map, and the rebalancing and recovery finished
> successfully.
> This weekend we did the same with the second node: we created the
> HDD-based OSDs and added them to the cluster, waited for the
> rebalancing to finish and then stopped the old OSDs. Only this time
> the recovery didn't completely finish; 4 PGs stayed stuck unclean. I
> found out that 3 of these 4 PGs had their primary OSD on that node, so
> I restarted the respective services and those 3 PGs recovered
> successfully. But there is one last PG that gives me headaches.
>
> ceph@ndesan01:~ # ceph pg map 1.3d3
> osdmap e24320 pg 1.3d3 (1.3d3) -> up [16,21] acting [16,21,0]
>

What version of Ceph? And could it be that the cluster has old CRUSH
tunables? When was it installed, and with which Ceph version?
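A quick way to check both, assuming the standard ceph CLI on one of the
nodes (a sketch, not output from this cluster):

    ceph --version                 # version of the locally installed binaries
    ceph tell osd.* version        # confirm all running OSDs report the same version
    ceph osd crush show-tunables   # which CRUSH tunables profile is in effect

If show-tunables reports a legacy profile, that could explain why CRUSH
maps only two OSDs into the up set ([16,21]) for a PG whose acting set
([16,21,0]) suggests a size-3 pool.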
Wido

> ceph@ndesan01:~/ceph-deploy> ceph osd tree
> ID WEIGHT  TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 9.38985 root default
> -2 1.19995     host ndesan01
>  0 0.23999         osd.0           up  1.00000          1.00000
>  1 0.23999         osd.1           up  1.00000          1.00000
>  2 0.23999         osd.2           up  1.00000          1.00000
> 13 0.23999         osd.13          up  1.00000          1.00000
> 19 0.23999         osd.19          up  1.00000          1.00000
> -3 1.81998     host ndesan02
>  3       0         osd.3         down        0          1.00000
>  4       0         osd.4         down        0          1.00000
>  5       0         osd.5         down        0          1.00000
>  9       0         osd.9         down  1.00000          1.00000
> 10       0         osd.10        down  1.00000          1.00000
>  6 0.90999         osd.6           up  1.00000          1.00000
>  7 0.90999         osd.7           up  1.00000          1.00000
> -4 1.81998     host nde32
> 20 0.90999         osd.20          up  1.00000          1.00000
> 21 0.90999         osd.21          up  1.00000          1.00000
> -5 4.54994     host ndesan03
> 14 0.90999         osd.14          up  1.00000          1.00000
> 15 0.90999         osd.15          up  1.00000          1.00000
> 16 0.90999         osd.16          up  1.00000          1.00000
> 17 0.90999         osd.17          up  1.00000          1.00000
> 18 0.90999         osd.18          up  1.00000          1.00000
>
> All OSDs marked as "down" are going to be removed. I looked for that
> PG on all 3 nodes, and all of them have it. All services are up and
> running, but for some reason this PG is not aware of that. Is there
> any reasonable explanation and/or some advice on how to get that PG
> recovered?
>
> One thing I noticed:
>
> The data on the primary OSD (osd.16) had different timestamps than on
> the other two OSDs:
>
> ---cut here---
> ndesan03:~ # ls -rtl /var/lib/ceph/osd/ceph-16/current/1.3d3_head/
> total 389436
> -rw-r--r-- 1 root root       0 Jul 12  2016 __head_000003D3__1
> ...
> -rw-r--r-- 1 root root       0 Jan  9 10:43 rbd\udata.bca465368d6b49.0000000000000a06__head_20EFF3D3__1
> -rw-r--r-- 1 root root       0 Jan  9 10:43 rbd\udata.bca465368d6b49.0000000000000a8b__head_A014F3D3__1
> -rw-r--r-- 1 root root       0 Jan  9 10:44 rbd\udata.bca465368d6b49.0000000000000e2c__head_00F2D3D3__1
> -rw-r--r-- 1 root root       0 Jan  9 10:44 rbd\udata.bca465368d6b49.0000000000000e6a__head_C91813D3__1
> -rw-r--r-- 1 root root 8388608 Jan 20 13:53 rbd\udata.cc94344e6afb66.00000000000008cb__head_6AA4B3D3__1
> -rw-r--r-- 1 root root 8388608 Jan 20 14:47 rbd\udata.e15aee238e1f29.00000000000005f0__head_C95063D3__1
> -rw-r--r-- 1 root root 8388608 Jan 20 15:10 rbd\udata.e15aee238e1f29.0000000000000d15__head_FF1083D3__1
> -rw-r--r-- 1 root root 8388608 Jan 20 15:19 rbd\udata.e15aee238e1f29.000000000000100c__head_6B17F3D3__1
> -rw-r--r-- 1 root root 8388608 Jan 23 14:17 rbd\udata.e73cf7b03e0c6.0000000000000479__head_C16003D3__1
> -rw-r--r-- 1 root root 8388608 Jan 25 11:52 rbd\udata.d4edc95e884adc.00000000000000f4__head_00EE43D3__1
> -rw-r--r-- 1 root root 4194304 Jan 27 08:07 rbd\udata.34595be2237e6.0000000000000ad5__head_D3CC93D3__1
> -rw-r--r-- 1 root root 4194304 Jan 27 08:08 rbd\udata.34595be2237e6.0000000000000aff__head_3BF633D3__1
> -rw-r--r-- 1 root root 4194304 Jan 27 16:20 rbd\udata.8b61c69f34baf.000000000000876a__head_A60A63D3__1
> -rw-r--r-- 1 root root 4194304 Jan 29 17:45 rbd\udata.28fcaf199543c3.0000000000000ae7__head_C1BA53D3__1
> -rw-r--r-- 1 root root 4194304 Jan 30 06:33 rbd\udata.28fcaf199543c3.0000000000001832__head_6EC113D3__1
> -rw-r--r-- 1 root root 4194304 Jan 31 10:33 rb.0.ddcdf5.238e1f29.0000000000e4__head_3F1543D3__1
> -rw-r--r-- 1 root root 4194304 Feb 13 06:14 rbd\udata.856071751c29d.000000000000617b__head_E1E4A3D3__1
> ---cut here---
>
> The other two OSDs have identical timestamps; I just post the
> (shortened) output of osd.21:
>
> ---cut here---
> nde32:/var/lib/ceph/osd/ceph-21/current # ls -lrt /var/lib/ceph/osd/ceph-21/current/1.3d3_head/
> total 389432
> -rw-r--r-- 1 root root       0 Feb  6 15:29 __head_000003D3__1
> ...
> -rw-r--r-- 1 root root       0 Feb  6 16:46 rbd\udata.a00851d652069.00000000000007a4__head_C55DB3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.947feb21a163a2.0000000000004349__head_A37FB3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.8b61c69f34baf.00000000000068cb__head_B4A2C3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.874a620334da.00000000000004ed__head_3835C3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.8b61c69f34baf.0000000000004424__head_5BA7C3D3__1
> -rw-r--r-- 1 root root 8388608 Feb  6 16:47 rbd\udata.31a3e57d64476.0000000000000418__head_B158C3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.1128db1b5d2111.00000000000002eb__head_81AAC3D3__1
> -rw-r--r-- 1 root root       0 Feb  6 16:47 rbd\udata.bca465368d6b49.0000000000000e2c__head_00F2D3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.2d6fe91cf37a46.000000000000019e__head_2346D3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.856071751c29d.0000000000006134__head_C876E3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.949da61c92b32c.0000000000000a18__head_397BE3D3__1
> -rw-r--r-- 1 root root 8388608 Feb  6 16:47 rbd\udata.567d57d819eed.000000000000034f__head_FC83F3D3__1
> -rw-r--r-- 1 root root       0 Feb  6 16:47 rbd\udata.bca465368d6b49.0000000000000a8b__head_A014F3D3__1
> -rw-r--r-- 1 root root 4194304 Feb  6 16:47 rbd\udata.856071751c29d.0000000000003a2c__head_0684F3D3__1
> -rw-r--r-- 1 root root 8388608 Feb  6 16:47 rbd\udata.e15aee238e1f29.000000000000100c__head_6B17F3D3__1
> -rw-r--r-- 1 root root       0 Feb  6 16:47 rbd\udata.bca465368d6b49.0000000000000a06__head_20EFF3D3__1
> -rw-r--r-- 1 root root 4194304 Feb 13 06:14 rbd\udata.856071751c29d.000000000000617b__head_E1E4A3D3__1
> ---cut here---
>
> So I figured that the data on the primary OSD could be the problem,
> copied the content over from one of the other OSDs and restarted all
> 3 OSDs, but the status didn't change. How can I repair this PG?
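For a PG that stays unclean, the usual first step is to ask the cluster
why it is stuck rather than to edit files in the PG directory by hand;
the PG state lives in the OSD maps and the PG's own metadata, not in the
object timestamps. A rough sketch, assuming the standard Hammer CLI:

    ceph health detail | grep 1.3d3   # current state of this PG and the OSDs it involves
    ceph pg 1.3d3 query               # peering/recovery state and which OSDs it is waiting for
    ceph pg dump_stuck unclean        # all stuck-unclean PGs with their up/acting sets

If the query output shows the PG waiting for one of the old OSDs,
removing those OSDs for good (ceph osd crush remove osd.N, ceph auth del
osd.N, ceph osd rm osd.N) should let it re-peer once CRUSH can no longer
map to them.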
> Another question about OSD replacement: why didn't the cluster switch
> the primary OSD for all PGs when the OSDs went down? If this had been
> a real disk failure, I would have doubts about a full recovery. Or
> should I have deleted that PG instead of re-activating the old OSDs?
> I'm not sure what the best practice would be in this case.
>
> Any help is appreciated!
>
> Regards,
> Eugen
>
> --
> Eugen Block                            voice  : +49-40-559 51 75
> NDE Netzdesign und -entwicklung AG     fax    : +49-40-559 51 77
> Postfach 61 03 15
> D-22423 Hamburg                        e-mail : eblock@xxxxxx
>
> Chairwoman of the supervisory board: Angelika Mozdzen
> Registered office and court of registration: Hamburg, HRB 90934
> Executive board: Jens-U. Mozdzen
> VAT ID: DE 814 013 983

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com