Re: cold-storage tuning Ceph

Sorry this is delayed, I'm catching up. I believe this was discussed at
the last Ceph Developer Summit; I think this is the relevant blueprint:
https://wiki.ceph.com/Planning/Blueprints/Hammer/Towards_Ceph_Cold_Storage

On Wed, Jan 14, 2015 at 9:35 AM, Martin Millnert <martin@xxxxxxxxxxx> wrote:
> Hello list,
>
> I'm currently trying to understand how I can tune Ceph for a
> cold-storage (write-once, read-very-rarely) scenario, comparing its
> cost against LTO-6 tape.
>
> There is a single main objective:
>  - minimal cost/GB/month of operations (including power and datacenter)
>
> To achieve this, I can break it down to:
>  - Use best cost/GB HDD
>    * SMR today
>  - Minimal cost/3.5"-slot
>  - Minimal power-utilization/drive
>
> While staying within what is available today, I don't imagine powering
> down individual disk slots via IPMI etc., as some vendors allow.
>
> Now, putting Ceph on top of this, the drives will be powered on, but it
> would be very useful to be able to spin down drives that aren't in use.
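>
> As a rough illustration of what I mean (an untested sketch; the device
> list is a placeholder and would have to match the actual cold-tier OSD
> drives), something like this could set an idle spin-down timeout per
> drive:
>
> #!/usr/bin/env python3
> # Untested sketch: ask cold-tier drives to spin down after an hour idle.
> import subprocess
>
> COLD_DRIVES = ["/dev/sdb", "/dev/sdc"]  # placeholder device paths
>
> for dev in COLD_DRIVES:
>     # hdparm -S 242 = enter standby after 60 minutes of inactivity
>     subprocess.run(["hdparm", "-S", "242", dev], check=True)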
>
> It then seems to me that I want to do a few things with Ceph:
>  - Have only a subset of the cluster 'active' for writes at any point in
>    time
>  - Yet, still have the entire cluster online and available for reads
>  - Minimize concurrent RAM-using OSD operations in a node, e.g.
>    - Scrubbing: keep the number of OSDs scrubbing minimal (ideally max 1)
>    - In general, minimize the number of concurrently "active" OSDs
>  - Minimize the risk that any re-balancing of data occurs at all
>    - E.g. use a "high" number of EC parity chunks (rough sketch below)
>
>
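> For the EC part, I picture something along these lines (untested;
> profile/pool names and the k/m values are placeholders, and the PG
> count would need sizing for the real cluster):
>
> #!/usr/bin/env python3
> # Untested sketch: an EC profile with a generous parity count (m) so
> # that scrub-detected errors rarely force an urgent rebuild.
> import subprocess
>
> def ceph(*args):
>     subprocess.run(["ceph"] + list(args), check=True)
>
> # 10 data + 4 parity chunks, failure domain = host (Hammer-era option name)
> ceph("osd", "erasure-code-profile", "set", "cold-profile",
>      "k=10", "m=4", "ruleset-failure-domain=host")
>
> # placeholder PG count; size it to the actual number of OSDs
> ceph("osd", "pool", "create", "cold", "1024", "1024", "erasure", "cold-profile")
>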
> Assuming e.g. 16 drives/host with 10TB drives, ~100MB/s read, and a
> nearly full cluster, deep scrubbing a whole host would take about 18.5
> days, i.e. at most roughly one and a half full passes per month.
> Using an EC pool, I wouldn't be very worried about errors, so perhaps
> that's ok (and calculable), but the errors obviously still need to be
> repaired.
> Mathematically, I can increase the number of parity chunks to lengthen
> the allowable interval between deep scrubs.
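>
> (For reference, the 18.5-day figure is just a back-of-the-envelope
> calculation, e.g.:)
>
> # 16 x 10 TB drives scrubbed sequentially at ~100 MB/s
> drives, tb_per_drive, mb_per_s = 16, 10, 100
> seconds = (drives * tb_per_drive * 1e12) / (mb_per_s * 1e6)
> print(seconds / 86400)  # ~18.5 days for a full host deep scrub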
>
>
> Is there anyone on the list who can provide some thoughts on the
> higher-order goal of "Minimizing concurrently active OSDs in a node"?
>
> I imagine I need to steer writes towards a subset of the system, but I
> have no idea how to implement that. Using multiple separate clusters,
> e.g. with each OSD on a node participating in a unique cluster, could
> perhaps help.
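>
> One alternative I've considered, purely as an untested sketch (bucket,
> rule, host and pool names below are all made up), is keeping a single
> cluster but pointing the pool that receives new writes at a dedicated
> CRUSH root containing only the currently "active" hosts:
>
> #!/usr/bin/env python3
> # Untested sketch: a CRUSH root holding only the hosts currently open
> # for writes, plus a simple rule that targets it. Names are placeholders.
> import subprocess
>
> def ceph(*args):
>     subprocess.run(["ceph"] + list(args), check=True)
>
> ceph("osd", "crush", "add-bucket", "active-root", "root")
> ceph("osd", "crush", "move", "host1", "root=active-root")
> ceph("osd", "crush", "rule", "create-simple", "active-rule",
>      "active-root", "host")
>
> # then point the write pool at that rule; crush_ruleset takes the rule's
> # numeric id, which "ceph osd crush rule dump" shows:
> # ceph("osd", "pool", "set", "write-pool", "crush_ruleset", "<rule-id>")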
>
> Any feedback appreciated.  It does appear to be a hot topic (pun intended).
>
> Best,
> Martin
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



