How to fix an incomplete PG on a 2-copy Ceph cluster?


 



Hi,
I switched some disks from a manual format to ceph-deploy (because of
slightly different xfs parameters) - all of these disks are on a single
node of a 4-node cluster.
After rebuilding the OSD disks, one PG is incomplete:
ceph -s
    cluster 591db070-15c1-4c7a-b107-67717bdb87d9
     health HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs
stuck unclean
     monmap e7: 3 mons at
{a=172.20.2.11:6789/0,b=172.20.2.64:6789/0,c=172.20.2.65:6789/0},
election epoch 1178, quorum 0,1,2 a,b,c
     mdsmap e409: 1/1/1 up {0=b=up:active}, 2 up:standby
     osdmap e22002: 52 osds: 52 up, 52 in
      pgmap v10177038: 7408 pgs, 5 pools, 58618 GB data, 14662 kobjects
            114 TB used, 76319 GB / 189 TB avail
                7405 active+clean
                   1 incomplete
                   2 active+clean+scrubbing+deep
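
The stuck PG itself can also be listed directly - noting the commands here
for completeness, output omitted:

ceph health detail
ceph pg dump_stuck inactive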

The PG is on one of the rebuilt disks (osd.42):
ceph pg map 6.289
osdmap e22002 pg 6.289 (6.289) -> up [42,31] acting [42,31]
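
Presumably the peering details could also be inspected with a pg query
(output not pasted here; the field names vary a bit between releases):

ceph pg 6.289 query

The "recovery_state" section of that output should show why peering does
not complete.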

ls -lsa /var/lib/ceph/osd/ceph-42/current/6.289_head/
total 16
 0 drwxr-xr-x   2 root root     6 Feb 15 20:11 .
16 drwxr-xr-x 411 root root 12288 Feb 16 03:09 ..

ls -lsa /var/lib/ceph/osd/ceph-31/current/6.289*/

/var/lib/ceph/osd/ceph-31/current/6.289_head/:
total 20520
   8 drwxr-xr-x   2 root root    4096 Feb 15 10:24 .
  12 drwxr-xr-x 320 root root    8192 Feb 15 21:11 ..
4100 -rw-r--r--   1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u638085\uobject2844__head_4F14E289__6
4100 -rw-r--r--   1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u638085\uobject3975__head_A7EBCA89__6
4100 -rw-r--r--   1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u638085\uobject4003__head_537FE289__6
4100 -rw-r--r--   1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u673679\uobject344__head_FF4A1289__6
4100 -rw-r--r--   1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u673679\uobject474__head_5FC3EA89__6

/var/lib/ceph/osd/ceph-31/current/6.289_TEMP/:
total 16
 4 drwxr-xr-x   2 root root    6 Feb 15 10:24 .
12 drwxr-xr-x 320 root root 8192 Feb 15 21:11 ..

How do I tell Ceph that the content on osd.31 is the right one?
I have tried "ceph osd repair osd.42" without luck.
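
One approach that might work (only a sketch: it assumes
ceph-objectstore-tool is available in this release - older releases ship
it as ceph_objectstore_tool - and that the data/journal paths below match
this deployment) is to copy the PG from osd.31 into osd.42 while both
OSDs are stopped:

# stop both OSDs so their stores are quiescent (init syntax differs per distro)
service ceph stop osd.31
service ceph stop osd.42

# export the intact copy of the PG from osd.31
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-31 \
    --journal-path /var/lib/ceph/osd/ceph-31/journal \
    --pgid 6.289 --op export --file /tmp/pg6.289.export

# remove the empty copy on osd.42 (some releases also want --force here)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-42 \
    --journal-path /var/lib/ceph/osd/ceph-42/journal \
    --pgid 6.289 --op remove

# import the exported copy into osd.42
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-42 \
    --journal-path /var/lib/ceph/osd/ceph-42/journal \
    --pgid 6.289 --op import --file /tmp/pg6.289.export

# start the OSDs again and let peering re-run
service ceph start osd.31
service ceph start osd.42

I have not tried this yet, so corrections are welcome.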

In the manual I only found "ceph osd lost NN", but then I guess all the
other data on that OSD will also be rebuilt onto other disks.
If "osd lost" is the only option, how do I reuse osd.42 afterwards? Wait
for a healthy cluster and then recreate the disk?

Hoping for a hint.


Best regards

Udo



