Re: domino-style OSD crash

On 05/07/2012 23:32, Gregory Farnum wrote:

[...]
OK, so as all nodes were identical, I probably hit a btrfs bug (like an
erroneous out-of-space condition) at more or less the same time. And when 1 OSD was
out,

Oh, I didn't finish the sentence... When 1 OSD was out, its missing data was copied to other nodes, probably accelerating the btrfs problem on those nodes (I suspect erroneous out-of-space conditions).

I've reformatted the OSDs with xfs. Performance is slightly worse for the moment (well, it depends on the workload, and maybe the lack of syncfs is to blame), but at least I hope the storage layer will be rock-solid. BTW, I've managed to keep the faulty btrfs volumes.

[...]

I wonder if maybe there's a confounding factor here — are all your nodes
similar to each other,
Yes. I designed the cluster that way. All nodes are identical hardware
(PowerEdge M610, 10G Intel Ethernet + Emulex Fibre Channel attached to
storage; 1 array for 2 OSD nodes, 1 controller dedicated to each OSD).
Oh, interesting. Are the broken nodes all on the same set of arrays?

No. There are 4 completely independent RAID arrays, in 4 different locations. They are similar (same brand and model, but slightly different disks, and 1 has a different firmware), and all arrays are multipathed. I don't think the RAID array is the problem. We have used those particular models for 2 to 3 years, and in the logs I don't see any problem that could be caused by the storage itself (like SCSI or multipath errors).

Cheers,

--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@xxxxxxxxxxxxxx

--

