Re: some newbie questions...

Oliver Daudey writes:

>>>>> 1) I read somewhere that it is recommended to have one OSD per disk in a production environment.
>>>>>     Is that also the maximum number of disks per OSD, or could I use multiple disks per OSD? And why?
>>>>
>>>> You could use multiple disks for one OSD if you used some striping layer to abstract the disks (like LVM, MDRAID, etc.), but it wouldn't make much sense. One OSD writes into one filesystem, which is usually one disk in a production environment. Using RAID underneath it wouldn't drastically increase either reliability or performance.
>>>
>>> I see some sense in RAID 0: a single ceph-osd daemon per node (though one
>>> OSD per disk is still the norm). But if you have relatively few [planned]
>>> cores per task on a node, you can think about it.
>>
>> RAID-0: a single disk failure kills the entire filesystem, off-lines the
>> OSD and triggers a cluster-wide resync.  Actual RAID: a single disk failure
>> does not affect the cluster in any way.
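To make the difference concrete, here is a rough back-of-the-envelope sketch;
the disk size, disk count and fill level are assumed example values, not
measurements from a real cluster:

    # Sketch of how much data has to be re-replicated after one disk fails,
    # comparing one OSD per disk with one RAID-0 OSD spanning several disks.
    # All numbers are assumptions chosen only for illustration.

    disk_size_tb = 1.0   # assumed capacity of a single disk
    disks = 6            # assumed disks in the node
    utilization = 0.7    # assumed fill level of the OSDs

    # One OSD per disk: only that disk's OSD goes down, so roughly one
    # disk's worth of data is re-replicated elsewhere.
    per_disk_osd = disk_size_tb * utilization

    # One RAID-0 OSD over all disks: any disk failure kills the whole
    # filesystem, so the entire OSD's contents must be re-replicated.
    raid0_osd = disk_size_tb * disks * utilization

    # Redundant RAID (RAID-1/5/6): the OSD stays up, so Ceph moves nothing.
    redundant_raid = 0.0

    print(f"one OSD per disk: ~{per_disk_osd:.1f} TB to resync")
    print(f"RAID-0 OSD      : ~{raid0_osd:.1f} TB to resync")
    print(f"redundant RAID  : ~{redundant_raid:.1f} TB to resync")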
> 
> RAID-controllers also add a lot of manageability into the mix.  The fact
> that the chassis starts beeping and indicates exactly which disk needs
> replacing, and that it manages the automatic rebuild after replacement,
> makes operations much easier, even for less technical personnel.  Also, if
> you have fast disks and a good RAID-controller, it should offload the
> entire rebuild process from the node's main CPU without a performance hit
> on the Ceph-cluster or node.  As already said, OSDs are expensive on
> resources, too.  Having too many of them on one node and then having an
> entire node fail can cause a lot of traffic and load on the remaining
> nodes while things rebalance.
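To put a rough number on that last point, here is a small sketch of the
rebalance load after losing an entire node; the cluster size, OSD sizes and
utilization are again assumed example values:

    # Rough estimate of the extra data each surviving node has to absorb
    # when a whole node (and all of its OSDs) fails.  All parameters are
    # assumed example values, not data from a real cluster.

    nodes = 10          # assumed total nodes in the cluster
    osds_per_node = 6   # assumed OSDs (disks) per node
    osd_size_tb = 1.0   # assumed capacity per OSD
    utilization = 0.7   # assumed fill level

    # Data that lived on the failed node and must be re-replicated.
    lost_tb = osds_per_node * osd_size_tb * utilization

    # With a reasonably uniform CRUSH map, the surviving nodes share that
    # work more or less evenly (ignoring replica-placement details).
    per_survivor_tb = lost_tb / (nodes - 1)

    print(f"data to re-replicate      : ~{lost_tb:.1f} TB")
    print(f"extra data per other node : ~{per_survivor_tb:.2f} TB")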

Oh, no! A RAID controller ties you to specific hardware and/or its limitations.
Example: I have 3 nodes, 2 with plain SATA and 1 with an LSI MegaRAID SAS
controller. The SAS node has one advantage: a larger number of disks (I have
6x1TB OSDs on the SAS node and 3x2TB OSDs per SATA node), but many troubles: no
hot replacement (correct me if I am wrong about the MegaRAID), and disks
formatted by the RAID controller cannot be read on the other 2 nodes... You
speak about a GOOD controller, so yes, good is good. But for Ceph I see only 2
reasons for a special controller: possibly better speed and a battery-backed
cache. All the other jobs (striping, fault tolerance) are Ceph's. It is better
to buy as many of the biggest disks as possible and put them into many ordinary
SATA machines.

And I usually kill the hardware RAID on new machines and set up mdadm instead
(if it is a single-node Linux server), to avoid painful games with varying
hardware.

-- 
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




