Re: Ceph version 0.56.1, data loss on power failure

On 16/01/2013 17:56, Jeff Mitchell wrote:
FWIW, my ceph data dirs (for e.g. mons) are all on XFS. I've
experienced a lot of corruption on these on power loss to the node --
and in some cases even when power wasn't lost, and the box was simply
rebooted. This is on Ubuntu 12.04 with the ceph-provided 3.6.3 kernel
(as I'm using RBD on these).

It's pretty much to the point where I'm thinking of changing them all
over to ext4 for these data dirs, as the hassle of rebuilding mons
constantly is just not worth the trouble.

In October, I lost a complete Ceph cluster because of a combination of a memory management bug in kernel 3.6 and a separate bug in XFS. I had 12 nodes with replication at 2; 5 or 6 machines crashed in a row because of the mm bug, and 2 ended up with unrecoverable corruption.

So 150 TB of data on the cluster was unrecoverable. Fortunately, it was only test data.

If you want the gory details, see here:

http://oss.sgi.com/archives/xfs/2012-10/msg00420.html

This XFS bug was fixed in 3.0.52, 3.2.34, 3.4.19 and 3.6.7. Dave Chinner was very quick to fix the problem.
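
For anyone who wants to check a node against those versions, here is a rough sketch in Python. The function name has_xfs_fix and the table are made up for illustration, and it only knows about the four stable series listed above; for any other series it returns None rather than guessing. Distribution kernels (Ubuntu's, for example) use their own numbering and may carry backports, so treat the answer as a hint only.

```python
import re

# Fixed releases per stable series, taken from the list in this message.
# Keys are (major, minor); values are the first patch level with the fix.
XFS_FIX_RELEASES = {
    (3, 0): 52,
    (3, 2): 34,
    (3, 4): 19,
    (3, 6): 7,
}


def has_xfs_fix(release):
    """Return True/False for a listed series, None when the series is unknown.

    `release` is a version string such as "3.4.25" or "3.6.7-custom".
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    fixed_patch = XFS_FIX_RELEASES.get((major, minor))
    if fixed_patch is None:
        # Series not mentioned in this message: no claim either way.
        return None
    return patch >= fixed_patch
```

On a running node you could feed it platform.release() from the standard library, for example has_xfs_fix(platform.release()).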

Add to that the latest bug (journal not flushed properly), which is not yet fixed in the latest kernels, and I can understand your reaction.

But, believe it or not, I'm still confident in XFS. I've been using it for more than 10 years on terabytes and terabytes of data, and apart from these recent problems, XFS has been extremely good (stability, performance, crash tolerance) all this time.

I'm not saying ext4 isn't good, but if you follow kernel development, you'll see that it's not bug-free either.

Not to mention btrfs, which was totally unstable with Ceph in my last tries (6 months ago).

In fact, Ceph hammers the hardware hard, so it's very good at finding bugs in the Linux kernel :)


So, for the moment, I'm sticking with the 3.4.25 kernel: a long-term kernel, proven and stable, with no mm problems and no XFS problems.


Cheers,

--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@xxxxxxxxxxxxxx
