Hi,

I managed to remove the warning by reweighting the crashed OSD:

# ceph osd crush reweight osd.33 0.8

After the recovery, the cluster is no longer showing the warning.

Xabier

On 29/11/16 11:18, Xabier Elkano wrote:
> Hi all,
>
> my cluster is in WARN state because apparently some PGs are unfound.
> I think I reached this situation because of the metadata pool: it sat
> in the default root but was unused, since I don't use CephFS, only
> RBD for VMs. I don't have any OSDs in the default root; they are
> assigned to different roots depending on their disk type.
>
> My pools have specific crush rules to use the different roots, all of
> them except the metadata pool, which was assigned to the default root.
> In this situation I had problems with one OSD (I had to rebuild it
> from scratch), and when I restored it some PGs ended up with unfound
> objects, because they were on the faulty OSD and belonged to the
> metadata pool, which had size 1. Because I didn't care about any data
> in the metadata pool, I recreated the unfound PGs with
> "ceph pg force_create_pg 1.25". Finally I set a crush rule on the
> metadata pool to change its location, and the PGs were created.
>
> But now the cluster is showing 29 unfound objects, without saying
> which PGs they belong to.
> How can I recover from this situation? Can I remove the metadata pool
> and recreate it?
>
>
> # ceph status
>     cluster 72a4a18b-ec5c-454d-9135-04362c97c307
>      health HEALTH_WARN
>             recovery 29/2748828 unfound (0.001%)
>      monmap e13: 5 mons at
> {mon1=172.16.64.12:6789/0,mon2=172.16.64.13:6789/0,mon3=172.16.64.16:6789/0,mon4=172.16.64.30:6789/0,mon5=172.16.64.31:6789/0}
>             election epoch 99672, quorum 0,1,2,3,4 mon1,mon2,mon3,mon4,mon5
>      mdsmap e35323: 0/0/1 up
>      osdmap e49648: 38 osds: 38 up, 38 in
>       pgmap v76150847: 3065 pgs, 21 pools, 10654 GB data, 2684 kobjects
>             31111 GB used, 25423 GB / 56534 GB avail
>             29/2748828 unfound (0.001%)
>                 3063 active+clean
>                    2 active+clean+scrubbing
>   client io 4431 kB/s rd, 15897 kB/s wr, 2385 op/s
>
>
> # ceph health detail
> HEALTH_WARN recovery 29/2748829 unfound (0.001%)
> recovery 29/2748829 unfound (0.001%)
>
>
> My cluster runs Hammer 0.94.9:
> 5 servers with 7 OSDs each, on Ubuntu 14.04, and
> 5 monitor servers.
>
> Thanks and best regards,
> Xabier
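
For the record, on Hammer the PGs carrying unfound objects can normally be tracked down and cleared with something like the following sketch (1.25 here is just the PG id from the example above; any PG that "ceph health detail" reports as having unfound objects would be handled the same way):

# ceph pg dump_stuck unclean               # list PGs that are not active+clean
# ceph pg 1.25 query                       # check the recovery state and which OSDs were probed
# ceph pg 1.25 list_missing                # show the objects this PG cannot find
# ceph pg 1.25 mark_unfound_lost delete    # give the unfound objects up for lost

mark_unfound_lost also accepts "revert" to roll each object back to its previous version instead of deleting it; "delete" fits this case only because the metadata pool held nothing worth keeping.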
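
The crush rule change mentioned above would look roughly like this on Hammer (the rule name "metadata_ssd", the root "ssd", the pool name "metadata" and the ruleset id 4 are only placeholders for whatever the cluster actually uses):

# ceph osd crush rule create-simple metadata_ssd ssd host   # new rule rooted in a populated root
# ceph osd crush rule dump                                  # note the ruleset id of the new rule
# ceph osd pool set metadata crush_ruleset 4                # 4 = the ruleset id found above
# ceph osd tree                                             # verify the chosen root actually contains OSDs

Note that on Hammer the pool parameter is still called crush_ruleset; it was renamed to crush_rule in later releases.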