Re: Is ceph production ready? [was: Ceph PG Incomplete = Cluster unusable]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Nico,

I would probably recommend to upgrade to 0.87 (giant). I am running this version for some time now and it works very well. I also upgraded from firefly and it was easy.

The issue you are experiencing seems quite complex and it would require debug logs to troubleshoot.

Apology that I did not help much.

-Jiri

On 9/01/2015 20:23, Nico Schottelius wrote:
Good morning Jiri,

sure, let me catch up on this:

- Kernel 3.16
- ceph: 0.80.7
- fs: xfs
- os: debian (backports) (1x)/ubuntu (2x)

Cheers,

Nico

Jiri Kanicky [Fri, Jan 09, 2015 at 10:44:33AM +1100]:
Hi Nico.

If you are experiencing such issues it would be good if you provide more info about your deployment: ceph version, kernel versions, OS, filesystem btrfs/xfs.

Thx Jiri

----- Reply message -----
From: "Nico Schottelius" <nico-eph-users@xxxxxxxxxxxxxxx>
To: <ceph-users@xxxxxxxxxxxxxx>
Subject:  Is ceph production ready? [was: Ceph PG Incomplete = Cluster unusable]
Date: Wed, Dec 31, 2014 02:36

Good evening,

we also tried to rescue data *from* our old / broken pool by map'ing the
rbd devices, mounting them on a host and rsync'ing away as much as
possible.

However, after some time rsync got completly stuck and eventually the
host which mounted the rbd mapped devices decided to kernel panic at
which time we decided to drop the pool and go with a backup.

This story and the one of Christian makes me wonder:

Is anyone using ceph as a backend for qemu VM images in production?

And:

Has anyone on the list been able to recover from a pg incomplete /
stuck situation like ours?

Reading about the issues on the list here gives me the impression that
ceph as a software is stuck/incomplete and has not yet become ready
"clean" for production (sorry for the word joke).

Cheers,

Nico

Christian Eichelmann [Tue, Dec 30, 2014 at 12:17:23PM +0100]:
Hi Nico and all others who answered,

After some more trying to somehow get the pgs in a working state (I've
tried force_create_pg, which was putting then in creating state. But
that was obviously not true, since after rebooting one of the containing
osd's it went back to incomplete), I decided to save what can be saved.

I've created a new pool, created a new image there, mapped the old image
from the old pool and the new image from the new pool to a machine, to
copy data on posix level.

Unfortunately, formatting the image from the new pool hangs after some
time. So it seems that the new pool is suffering from the same problem
as the old pool. Which is totaly not understandable for me.

Right now, it seems like Ceph is giving me no options to either save
some of the still intact rbd volumes, or to create a new pool along the
old one to at least enable our clients to send data to ceph again.

To tell the truth, I guess that will result in the end of our ceph
project (running for already 9 Monthes).

Regards,
Christian

Am 29.12.2014 15:59, schrieb Nico Schottelius:
Hey Christian,

Christian Eichelmann [Mon, Dec 29, 2014 at 10:56:59AM +0100]:
[incomplete PG / RBD hanging, osd lost also not helping]
that is very interesting to hear, because we had a similar situation
with ceph 0.80.7 and had to re-create a pool, after I deleted 3 pg
directories to allow OSDs to start after the disk filled up completly.

So I am sorry not to being able to give you a good hint, but I am very
interested in seeing your problem solved, as it is a show stopper for
us, too. (*)

Cheers,

Nico

(*) We migrated from sheepdog to gluster to ceph and so far sheepdog
     seems to run much smoother. The first one is however not supported
     by opennebula directly, the second one not flexible enough to host
     our heterogeneous infrastructure (mixed disk sizes/amounts) - so we
     are using ceph at the moment.


--
Christian Eichelmann
Systemadministrator

1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8026
christian.eichelmann@xxxxxxxx

Amtsgericht Montabaur / HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
Aufsichtsratsvorsitzender: Michael Scheeren
--
New PGP key: 659B 0D91 E86E 7E24 FD15  69D0 C729 21A1 293F 2D24
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux