Hi Eneko,

I was trying an rbd cp before, but that was hanging as well. But I
couldn't find out whether the source image or the destination image was
causing the hang. That's why I decided to try a posix copy.

Our cluster is still nearly empty (12TB / 867TB). But as far as I
understood (if not, somebody please correct me), placement groups are
generally not shared between pools at all.

Regards,
Christian

On 30.12.2014 12:23, Eneko Lacunza wrote:
> Hi Christian,
>
> Have you tried to migrate the disk from the old storage (pool) to the
> new one?
>
> I think it should show the same problem, but I think it'd be a much
> easier path to recover than the posix copy.
>
> How full is your storage?
>
> Maybe you can customize the crushmap, so that some OSDs are left in the
> bad (default) pool, and other OSDs are set for the new pool. I think
> (I'm still learning ceph) that this will create different pgs for each
> pool, on different OSDs; maybe this way you can overcome the issue.
>
> Cheers
> Eneko
>
> On 30/12/14 12:17, Christian Eichelmann wrote:
>> Hi Nico and all others who answered,
>>
>> After some more trying to somehow get the pgs into a working state
>> (I've tried force_create_pg, which was putting them into creating
>> state. But that was obviously not true, since after rebooting one of
>> the containing OSDs they went back to incomplete), I decided to save
>> what can be saved.
>>
>> I've created a new pool, created a new image there, and mapped the old
>> image from the old pool and the new image from the new pool to a
>> machine, to copy the data at the posix level.
>>
>> Unfortunately, formatting the image from the new pool hangs after some
>> time. So it seems that the new pool is suffering from the same problem
>> as the old pool, which is totally not understandable to me.
>>
>> Right now, it seems like Ceph is giving me no options to either save
>> some of the still intact rbd volumes, or to create a new pool alongside
>> the old one to at least enable our clients to send data to ceph again.
>>
>> To tell the truth, I guess that will result in the end of our ceph
>> project (already running for 9 months).
>>
>> Regards,
>> Christian
>>
>> On 29.12.2014 15:59, Nico Schottelius wrote:
>>> Hey Christian,
>>>
>>> Christian Eichelmann [Mon, Dec 29, 2014 at 10:56:59AM +0100]:
>>>> [incomplete PG / RBD hanging, osd lost also not helping]
>>> That is very interesting to hear, because we had a similar situation
>>> with ceph 0.80.7 and had to re-create a pool after I deleted 3 pg
>>> directories to allow the OSDs to start after the disk filled up
>>> completely.
>>>
>>> So I am sorry not to be able to give you a good hint, but I am very
>>> interested in seeing your problem solved, as it is a show stopper for
>>> us, too. (*)
>>>
>>> Cheers,
>>>
>>> Nico
>>>
>>> (*) We migrated from sheepdog to gluster to ceph, and so far sheepdog
>>> seems to run much smoother. The first one is however not supported
>>> by opennebula directly, and the second one is not flexible enough to
>>> host our heterogeneous infrastructure (mixed disk sizes/amounts) - so
>>> we are using ceph at the moment.
>>>
>>
>

--
Christian Eichelmann
Systemadministrator

1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8026
christian.eichelmann@xxxxxxxx

Amtsgericht Montabaur / HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
Aufsichtsratsvorsitzender: Michael Scheeren

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
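[Editor's note: Eneko's crushmap suggestion above - leaving some OSDs in the old pool and dedicating others to the new one - can be sketched roughly as follows. The bucket name, rule name, OSD ids and weights below are hypothetical examples, not taken from this thread; a real map would be obtained with `ceph osd getcrushmap` and decompiled with `crushtool -d` before editing.]

```
# Hypothetical fragment to append to a decompiled CRUSH map:
# a separate root bucket containing only the OSDs reserved for the
# new pool, plus a rule that places data exclusively under it.

root newroot {
        id -10                  # example bucket id, must be unique
        alg straw
        hash 0                  # rjenkins1
        item osd.6 weight 1.000 # example OSDs dedicated to the new pool
        item osd.7 weight 1.000
}

rule newpool_rule {
        ruleset 1               # example ruleset number
        type replicated
        min_size 1
        max_size 10
        step take newroot       # only consider OSDs under newroot
        step choose firstn 0 type osd
        step emit
}
```

The edited map would then be recompiled with `crushtool -c`, injected with `ceph osd setcrushmap -i`, and assigned to the new pool with `ceph osd pool set <pool> crush_ruleset 1` (the pre-Luminous syntax matching the 0.80.x-era cluster discussed here).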