Hi Greg,

I have used the last resort with "ceph osd lost 42 --yes-i-really-mean-it", but the pg is still down:

ceph -s
    cluster 591db070-15c1-4c7a-b107-67717bdb87d9
     health HEALTH_WARN 206 pgs degraded; 1 pgs down; 57 pgs incomplete; 1 pgs peering; 31 pgs stuck inactive; 145 pgs stuck unclean; recovery 527486/30036784 objects degraded (1.756%); 1/52 in osds are down
     monmap e7: 3 mons at {a=172.20.2.11:6789/0,b=172.20.2.64:6789/0,c=172.20.2.65:6789/0}, election epoch 1178, quorum 0,1,2 a,b,c
     mdsmap e409: 1/1/1 up {0=b=up:active}, 2 up:standby
     osdmap e22281: 52 osds: 51 up, 52 in
      pgmap v10321809: 7408 pgs, 5 pools, 58634 GB data, 14666 kobjects
            114 TB used, 76285 GB / 189 TB avail
            527486/30036784 objects degraded (1.756%)
                7144 active+clean
                   1 down+peering
                 206 active+degraded
                  57 incomplete
  client io 60506 B/s wr, 6 op/s

The pg content is on osd-31:

ceph pg map 6.289
osdmap e22281 pg 6.289 (6.289) -> up [31] acting [31]

But an hour later the old mapping was rebuilt, and the pg directory on osd-44 exists again but is empty:

ceph pg map 6.289
osdmap e22312 pg 6.289 (6.289) -> up [44,31] acting [44,31]

ls -lsa /var/lib/ceph/osd/ceph-44/current/6.289_head/
total 32
 0 drwxr-xr-x   2 root root     6 Feb 17 21:37 .
32 drwxr-xr-x 515 root root 16384 Feb 18 08:23 ..

How can I remove/clean the PG? The content (benchmark files) is not necessary anymore.

The ugly thing is: how can this happen at all? There were no writes to this pg during the first stop of the osd!? I think the only way this condition can arise is the following scenario:

1. Disk X on node 4 was recreated, so the cluster was in a degraded state.
2. A write to pg 6.289 hit osd-42, and because of the setting "osd_pool_default_min_size = 1" the acknowledgement was sent to the client after the write on osd-42 completed, but before the write on node 2 (osd-31) happened.
3. osd-42 was stopped and also reformatted and rebuilt (before the write to pg 6.289 on osd-31 was done).

But there are two inconsistencies. First, an acknowledgement after writing to only one disk should only occur if the second disk is down - in this case both disks were up. Second, there were no writes during this time - I use the cluster only for VMs (from proxmox-ve) and the VM disks already exist - so writes should only go to existing pgs, like:

2.551_head/DIR_1/DIR_5/DIR_5/DIR_0/rbd\udata.89cef2ae8944a.000000000016936a__head_60480551__2

Is any of my assumptions wrong? Any comments on how to remove the incomplete pg? (A sketch of what I plan to try next is below, after the quoted message.)

Udo

On 16.02.2014 18:48, Gregory Farnum wrote:
> Check out http://ceph.com/docs/master/rados/operations/placement-groups/#get-statistics-for-stuck-pgs
> and http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/.
> What does the dump of the PG say is going on?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
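
P.S.: A minimal sketch of the commands I plan to run next, following the troubleshooting pages Greg linked. The pg id 6.289 comes from the output above; whether "force_create_pg" is the right (or even available) cleanup for an incomplete pg on this release is only my assumption, not something the docs confirm for this exact case.

ceph pg dump_stuck inactive    # list the stuck pgs, 6.289 should show up here
ceph pg 6.289 query            # dump the peering/recovery state of the incomplete pg
ceph pg map 6.289              # confirm the current up/acting set
# if the benchmark data is really expendable, recreate the pg empty -
# only my assumption that this is appropriate here:
ceph pg force_create_pg 6.289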