Re: CEPH over SW-RAID

Yes, but with SW-RAID, when we have a block that was read and does not match its checksum, the device falls out of the array and the data is read again from the other devices in the array. The problem is that with SW-RAID1 the bad blocks are not isolated. The disks can be synchronized again because the write operation is not verified, so the problem (the device falling out of the array) will happen again if we try to read any other data written over the bad block.
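
For context, the mdadm flow I am describing looks roughly like this (the device names are just examples):

    cat /proc/mdstat                              # the array shows up degraded after the read error
    mdadm --manage /dev/md0 --re-add /dev/sdb1    # the member is re-added and resynced; the resync
                                                  # rewrites the data but never verifies the writes,
                                                  # so the bad sector stays in place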

My new question regarding Ceph is whether it isolates these bad sectors where it found bad data while scrubbing, or will there always be a replica of something sitting on a known bad block?
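
From what I have seen, a scrub error surfaces and gets repaired roughly like this (the PG id is just an example), but that only rewrites the bad copy from a good replica, and I don't know whether the bad sector itself is avoided afterwards:

    ceph health detail     # reports something like "HEALTH_ERR ... pgs inconsistent; scrub errors"
    ceph pg repair 2.5     # 2.5 is an example PG id; rewrites the inconsistent copy from a good replica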

I also saw that Ceph uses some metrics when capturing data from the disks. When a disk is resetting or having problems, its metrics will be bad and the cluster will rank that OSD poorly. But I didn't see any way of sending alerts or anything like that. SW-RAID has its mdadm monitor, which alerts when things go bad. Do I have to keep looking at the Ceph logs all the time to see when things go wrong?
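
With mdadm I can just run the monitor daemon and get an e-mail; the closest thing I have found for Ceph is to poll the health status myself, something like this (the address is just an example):

    # mdadm's built-in alerting, for comparison
    mdadm --monitor --scan --daemonise --mail=admin@example.com

    # crude Ceph equivalent, e.g. from cron: mail me whenever the cluster is not healthy
    ceph health | grep -q HEALTH_OK || ceph health detail | mail -s "ceph alert" admin@example.com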

Thanks.
Jose Tavares

On Mon, Nov 23, 2015 at 3:19 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:

Most people run their clusters with no RAID for the data disks (some
will run RAID for the journals, but we don't). We use the scrub
mechanism to find data inconsistencies, and we use three copies to
provide redundancy across hosts/racks, etc. Unless you have a specific
need, it is best to forgo Linux SW RAID, and even HW RAID, with Ceph.
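
For example, with a replicated pool (the pool name here is just an illustration):

    ceph osd pool set rbd size 3       # keep three copies of every object
    ceph osd pool set rbd min_size 2   # keep serving I/O as long as two copies are available

    # and in the CRUSH rule, spread the copies across hosts (or racks):
    step chooseleaf firstn 0 type host
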
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, Nov 23, 2015 at 10:09 AM, Jose Tavares  wrote:
> Hi guys ...
>
> Is there any advantage in running CEPH over a Linux SW-RAID to avoid data
> corruption due to disk bad blocks?
>
> Can we just rely on the scrubbing feature of CEPH? Can we live without an
> underlying layer that keeps hardware problems from being passed up to CEPH?
>
> I have a setup where I put one OSD per node and I have a 2-disk RAID-1
> setup. Is that a good option, or would it be better to have 2 OSDs, one on
> each disk? If I had one OSD per disk, I would have to increase the number of
> replicas to guarantee enough replicas if one node goes down.
>
> Thanks a lot.
> Jose Tavares
