Check out http://ceph.com/docs/master/rados/operations/placement-groups/#get-statistics-for-stuck-pgs
and http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/.
What does the dump of the PG say is going on?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Sun, Feb 16, 2014 at 12:32 AM, Udo Lembke <ulembke@xxxxxxxxxxxx> wrote:
> Hi,
> I switched some disks from manual formatting to ceph-deploy (because of
> slightly different xfs parameters) - all disks are on a single node of a
> 4-node cluster.
> After rebuilding the OSD disks, one PG is incomplete:
>
> ceph -s
>     cluster 591db070-15c1-4c7a-b107-67717bdb87d9
>      health HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 1 pgs
> stuck unclean
>      monmap e7: 3 mons at
> {a=172.20.2.11:6789/0,b=172.20.2.64:6789/0,c=172.20.2.65:6789/0},
> election epoch 1178, quorum 0,1,2 a,b,c
>      mdsmap e409: 1/1/1 up {0=b=up:active}, 2 up:standby
>      osdmap e22002: 52 osds: 52 up, 52 in
>       pgmap v10177038: 7408 pgs, 5 pools, 58618 GB data, 14662 kobjects
>             114 TB used, 76319 GB / 189 TB avail
>                 7405 active+clean
>                    1 incomplete
>                    2 active+clean+scrubbing+deep
>
> The PG is on one of the rebuilt disks (osd.42):
>
> ceph pg map 6.289
> osdmap e22002 pg 6.289 (6.289) -> up [42,31] acting [42,31]
>
> ls -lsa /var/lib/ceph/osd/ceph-42/current/6.289_head/
> total 16
>  0 drwxr-xr-x   2 root root     6 Feb 15 20:11 .
> 16 drwxr-xr-x 411 root root 12288 Feb 16 03:09 ..
>
> ls -lsa /var/lib/ceph/osd/ceph-31/current/6.289*/
>
> /var/lib/ceph/osd/ceph-31/current/6.289_head/:
> total 20520
>  8 drwxr-xr-x   2 root root 4096 Feb 15 10:24 .
> 12 drwxr-xr-x 320 root root 8192 Feb 15 21:11 ..
> 4100 -rw-r--r-- 1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u638085\uobject2844__head_4F14E289__6
> 4100 -rw-r--r-- 1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u638085\uobject3975__head_A7EBCA89__6
> 4100 -rw-r--r-- 1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u638085\uobject4003__head_537FE289__6
> 4100 -rw-r--r-- 1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u673679\uobject344__head_FF4A1289__6
> 4100 -rw-r--r-- 1 root root 4194304 Feb 15 10:24 benchmark\udata\uproxmox4\u673679\uobject474__head_5FC3EA89__6
>
> /var/lib/ceph/osd/ceph-31/current/6.289_TEMP/:
> total 16
>  4 drwxr-xr-x   2 root root    6 Feb 15 10:24 .
> 12 drwxr-xr-x 320 root root 8192 Feb 15 21:11 ..
>
> How do I tell Ceph that the content on osd.31 is the right one?
> I have tried "ceph osd repair osd.42" without luck.
>
> In the manual I only saw "ceph osd lost NN", but then I guess all the
> other data on that OSD will also be rebuilt onto other disks.
> If "osd lost" is the only option, how do I reuse osd.42? Wait for a
> healthy cluster and then recreate the disk?
>
> Hope for a hint.
>
> Best regards
>
> Udo
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
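[Editor's sketch, not from the thread: Greg's suggestion is to look at what the PG query/dump reports. A `ceph pg 6.289 query` emits JSON describing the PG's state and its `up`/`acting` OSD sets; the minimal field names below (`state`, `up`, `acting`) are an assumption for illustration - a real query returns far more detail, including `recovery_state`.]

```python
import json

# Hypothetical, trimmed-down output of `ceph pg 6.289 query` for the
# situation in the thread (field names are assumptions for illustration).
sample = json.loads("""
{
  "state": "incomplete",
  "up": [42, 31],
  "acting": [42, 31]
}
""")

def diagnose(pg):
    """Summarize a pg-query-style dict for a stuck PG."""
    state = pg["state"]
    if "incomplete" in state:
        # Peering could not find an authoritative, complete copy of the PG.
        # Before resorting to `ceph osd lost`, check which acting OSD still
        # holds objects on disk (here, osd.31's 6.289_head is non-empty).
        return ("PG is %s on acting OSDs %s: peering found no complete copy"
                % (state, pg["acting"]))
    return "PG state: " + state

print(diagnose(sample))
```

The point of the sketch is only that the `state` string and `acting` set from the query are what tell you why the PG is stuck; the actual decision (e.g. whether marking osd.42 lost is safe) still depends on the full `recovery_state` section of the real query output.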