I think what happened is this:
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
"Note: Sometimes, typically in a 'small' cluster with few hosts (for instance
a small testing cluster), taking an OSD out can spawn a CRUSH corner case
where some PGs remain stuck in the active+remapped state."
It's a small cluster where the hosts have unequal numbers of OSDs; one of the OSD disks failed and I had taken it out. (With k=5, m=3 and failure domain host, each PG needs 8 distinct hosts, and the cluster has exactly 8, so CRUSH has no slack to remap around a problem host.)
I have already purged it, so I cannot use the reweight option mentioned in that link.
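(For reference, the reweight workaround from that link is roughly: mark the OSD back in and set its CRUSH weight to 0 instead of marking it out. If the failed disk had still been present as, say, osd.16:

    ceph osd in 16
    ceph osd crush reweight osd.16 0

With the OSD already purged there is nothing left to reweight.)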
So, are there any other workarounds?
Will adding more disks clear it?
Karun Josy
On Mon, Dec 18, 2017 at 9:06 AM, David Turner <drakonstein@xxxxxxxxx> wrote:
Maybe try outing the disk that should have a copy of the PG, but doesn't. Then mark it back in. It might check that it has everything properly and pull a copy of the data it's missing. I dunno.
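A sketch of that idea, assuming osd.1 is the one that should hold the missing copy (check the up/acting sets first, this is just the shape of it):

    ceph osd out 1
    # wait for peering/backfill to settle, then
    ceph osd in 1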
On Sun, Dec 17, 2017, 10:00 PM Karun Josy <karunjosy1@xxxxxxxxx> wrote:

Tried restarting all OSDs. Still no luck.
Will adding a new disk to any of the servers force a rebalance and fix it?

Karun Josy

On Sun, Dec 17, 2017 at 12:22 PM, Cary <dynamic.cary@xxxxxxxxx> wrote:

Karun,
Could you paste in the output from "ceph health detail"? Which OSD
was just added?
Cary
-Dynamic
On Sun, Dec 17, 2017 at 4:59 AM, Karun Josy <karunjosy1@xxxxxxxxx> wrote:
> Any help would be appreciated!
>
> Karun Josy
>
> On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy <karunjosy1@xxxxxxxxx> wrote:
>>
>> Hi,
>>
>> Repair didn't fix the issue.
>>
>> In the pg dump details, I notice this NONE. It seems the PG is missing from
>> one of the OSDs:
>>
>> [0,2,NONE,4,12,10,5,1]
>> [0,2,1,4,12,10,5,1]
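>>
>> A quick way to see which set is which, since the PG is 3.4:
>>
>>     ceph pg map 3.4
>>
>> which labels the up and acting sets; the NONE slot in the up set is the
>> position CRUSH could not fill.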
>>
>> Is there no way Ceph can correct this automatically? Do I have to
>> troubleshoot it manually?
>>
>> Karun
>>
>> On Sat, Dec 16, 2017 at 10:44 PM, Cary <dynamic.cary@xxxxxxxxx> wrote:
>>>
>>> Karun,
>>>
>>> Running ceph pg repair should not cause any problems. It may not fix
>>> the issue though. If that does not help, there is more information at
>>> the link below.
>>> http://ceph.com/geen-categorie/ceph-manually-repair-object/
>>>
>>> I recommend not rebooting or restarting OSDs while Ceph is repairing or
>>> recovering. If possible, wait until the cluster is in a healthy state
>>> first.
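>>>
>>> For example, check the cluster state first with:
>>>
>>>     ceph -s
>>>     ceph health detail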
>>>
>>> Cary
>>> -Dynamic
>>>
>>> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy <karunjosy1@xxxxxxxxx> wrote:
>>> > Hi Cary,
>>> >
>>> > No, I didn't try to repair it.
>>> > I am relatively new to Ceph. Is it okay to try to repair it?
>>> > Or should I take any precautions while doing it?
>>> >
>>> > Karun Josy
>>> >
>>> > On Sat, Dec 16, 2017 at 2:08 PM, Cary <dynamic.cary@xxxxxxxxx> wrote:
>>> >>
>>> >> Karun,
>>> >>
>>> >> Did you attempt a "ceph pg repair <pgid>"? Replace <pgid> with the PG
>>> >> ID that needs repair, i.e. 3.4.
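>>> >>
>>> >> That is:
>>> >>
>>> >>     ceph pg repair 3.4
>>> >>
>>> >> You can watch it with "ceph -w" or by re-checking "ceph pg dump".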
>>> >>
>>> >> Cary
>>> >> -D123
>>> >>
>>> >> On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy <karunjosy1@xxxxxxxxx>
>>> >> wrote:
>>> >> > Hello,
>>> >> >
>>> >> > I added 1 disk to the cluster and, after rebalancing, it shows 1 PG
>>> >> > in the remapped state. How can I correct it?
>>> >> >
>>> >> > (I had to restart some OSDs during the rebalancing as there were some
>>> >> > slow requests.)
>>> >> >
>>> >> > $ ceph pg dump | grep remapped
>>> >> > dumped all
>>> >> > 3.4  981  0  0  0  0  2655009792  1535  1535  active+clean+remapped
>>> >> >      2017-12-15 22:07:21.663964  2824'785115  2824:2297888
>>> >> >      up [0,2,NONE,4,12,10,5,1] 0  acting [0,2,1,4,12,10,5,1] 0
>>> >> >      2288'767367  2017-12-14 11:00:15.576741  417'518549
>>> >> >      2017-12-08 03:56:14.006982
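>>> >> >
>>> >> > To dig into why that slot is unmapped, one option is:
>>> >> >
>>> >> >     ceph pg 3.4 query
>>> >> >
>>> >> > which prints the recovery state and the up/acting sets for the PG.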
>>> >> >
>>> >> > That PG belongs to an erasure-coded pool with a k=5, m=3 profile;
>>> >> > the failure domain is host.
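>>> >> >
>>> >> > To double-check the profile settings, assuming the profile is named
>>> >> > "ecprofile":
>>> >> >
>>> >> >     ceph osd erasure-code-profile ls
>>> >> >     ceph osd erasure-code-profile get ecprofile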
>>> >> >
>>> >> > ===========
>>> >> >
>>> >> > $ ceph osd tree
>>> >> > ID  CLASS WEIGHT   TYPE NAME        STATUS REWEIGHT PRI-AFF
>>> >> > -1        16.94565 root default
>>> >> > -3         2.73788     host ceph-a1
>>> >> >  0    ssd  1.86469         osd.0        up  1.00000 1.00000
>>> >> > 14    ssd  0.87320         osd.14       up  1.00000 1.00000
>>> >> > -5         2.73788     host ceph-a2
>>> >> >  1    ssd  1.86469         osd.1        up  1.00000 1.00000
>>> >> > 15    ssd  0.87320         osd.15       up  1.00000 1.00000
>>> >> > -7         1.86469     host ceph-a3
>>> >> >  2    ssd  1.86469         osd.2        up  1.00000 1.00000
>>> >> > -9         1.74640     host ceph-a4
>>> >> >  3    ssd  0.87320         osd.3        up  1.00000 1.00000
>>> >> >  4    ssd  0.87320         osd.4        up  1.00000 1.00000
>>> >> > -11        1.74640     host ceph-a5
>>> >> >  5    ssd  0.87320         osd.5        up  1.00000 1.00000
>>> >> >  6    ssd  0.87320         osd.6        up  1.00000 1.00000
>>> >> > -13        1.74640     host ceph-a6
>>> >> >  7    ssd  0.87320         osd.7        up  1.00000 1.00000
>>> >> >  8    ssd  0.87320         osd.8        up  1.00000 1.00000
>>> >> > -15        1.74640     host ceph-a7
>>> >> >  9    ssd  0.87320         osd.9        up  1.00000 1.00000
>>> >> > 10    ssd  0.87320         osd.10       up  1.00000 1.00000
>>> >> > -17        2.61960     host ceph-a8
>>> >> > 11    ssd  0.87320         osd.11       up  1.00000 1.00000
>>> >> > 12    ssd  0.87320         osd.12       up  1.00000 1.00000
>>> >> > 13    ssd  0.87320         osd.13       up  1.00000 1.00000
>>> >> >
>>> >> >
>>> >> >
>>> >> > Karun
>>> >> >
>>> >
>>> >
>>
>>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com