Sorry this is delayed, catching up. I believe this was discussed at the
last Ceph summit. I think this was the blueprint:
https://wiki.ceph.com/Planning/Blueprints/Hammer/Towards_Ceph_Cold_Storage

On Wed, Jan 14, 2015 at 9:35 AM, Martin Millnert <martin@xxxxxxxxxxx> wrote:
> Hello list,
>
> I'm currently trying to understand what I can do with Ceph to optimize
> it for a cold-storage (write-once, read-very-rarely) scenario, trying
> to compare the cost against LTO-6 tape.
>
> There is a single main objective:
> - minimal cost/GB/month of operations (including power and DC)
>
> To achieve this, I can break it down to:
> - use the best cost/GB HDD
>   * SMR today
> - minimal cost per 3.5" slot
> - minimal power utilization per drive
>
> Staying within what is available today, I don't imagine powering down
> individual disk slots using IPMI etc., as some vendors allow.
>
> With Ceph on top, drives will be powered on, but it would be very
> useful to be able to spin down drives that aren't in use.
>
> It then seems to me that I want to do a few things with Ceph:
> - have only a subset of the cluster 'active' for writes at any point
>   in time
> - yet still have the entire cluster online and available for reads
> - minimize concurrent OSD operations in a node that use RAM, e.g.
>   - scrubbing: minimal number of OSDs active (ideally at most 1)
>   - in general, minimize concurrently "active" OSDs as per above
> - minimize the risk that any rebalancing of data occurs at all
>   - e.g. by using a "high" number of EC parity chunks
>
> Assuming e.g. 16 drives/host and 10 TB drives, at ~100 MB/s read and a
> nearly full cluster, deep scrubbing a host would take 18.5 days.
> That means roughly two deep scrubs per month.
> Using an EC pool I wouldn't be very worried about errors, so perhaps
> that's OK (calculable), but they obviously need to be repaired.
> Mathematically, I can increase the number of parity chunks to lengthen
> the interval between deep scrubs.
>
> Is there anyone on the list who can offer some thoughts on the
> higher-order goal of "minimizing concurrently active OSDs in a node"?
>
> I imagine I need to steer writes towards a subset of the system - but
> I have no idea how to implement it. Using multiple separate clusters,
> e.g. each OSD on a node participating in a unique cluster, could
> perhaps help.
>
> Any feedback appreciated. It does appear to be a hot topic (pun
> intended).
>
> Best,
> Martin
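
A few quick notes below the quote.

On the deep-scrub numbers: your back-of-the-envelope math checks out.
A quick sketch of the same arithmetic, assuming only one OSD per host
scrubs at a time so the host is read end to end at ~100 MB/s:

# Back-of-envelope: serial deep scrub of one host, numbers from the post.
drives_per_host = 16
drive_size_tb = 10      # ~10e12 bytes per drive
read_mb_s = 100         # sequential read rate while scrubbing

host_bytes = drives_per_host * drive_size_tb * 1e12
scrub_seconds = host_bytes / (read_mb_s * 1e6)
scrub_days = scrub_seconds / 86400

print("one host, serial deep scrub: %.1f days" % scrub_days)    # ~18.5
print("deep scrubs per 30-day month: %.1f" % (30 / scrub_days))  # ~1.6

If you relax the one-OSD-at-a-time constraint the wall-clock time
divides by the number of concurrent scrubs, but then you lose the
spin-down benefit on the other drives. Also worth noting: "osd max
scrubs" (default 1) limits concurrent scrubs per OSD, not per host, so
"at most one OSD per host scrubbing" would need some external
scheduling, e.g. setting nodeep-scrub cluster-wide and kicking off
"ceph pg deep-scrub" per PG yourself.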
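
On "more parity chunks buys a longer scrub interval": the intuition
holds if you model latent errors as independent per chunk. A toy
sketch - k=10 data chunks and the per-chunk error rate are made-up
placeholders, and the rate is simply scaled linearly with the scrub
interval:

from math import comb

def p_stripe_lost(k, m, p_chunk):
    """Probability that more than m of the k+m chunks of one stripe
    have developed a latent error, i.e. the stripe is unrecoverable."""
    n = k + m
    return sum(comb(n, j) * p_chunk**j * (1 - p_chunk)**(n - j)
               for j in range(m + 1, n + 1))

k = 10                         # data chunks (example value)
p_per_chunk_per_month = 1e-4   # made-up latent-error rate per chunk

for m in (2, 3, 4):
    for months_between_scrubs in (1, 3, 6, 12):
        p = p_per_chunk_per_month * months_between_scrubs
        print("m=%d, scrub every %2d months: stripe loss ~ %.1e"
              % (m, months_between_scrubs, p_stripe_lost(k, m, p)))

Roughly speaking, each extra parity chunk knocks the stripe-loss
probability down by a few orders of magnitude at a given interval, so
you can trade m against scrub frequency and repair traffic, at the
obvious cost of extra raw capacity.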
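
On "steer writes towards a subset of the system": rather than running
separate clusters per OSD, one thing to look at is the CRUSH map -
keep a separate CRUSH root containing only the hosts you currently
consider "active" and point the rule used by your ingest pool at that
root. A rough, untested sketch of the idea; the hostnames, pool name
and rule name are made up, and the command syntax should be checked
against your release:

#!/usr/bin/env python
# Untested sketch: print the ceph CLI calls that would put a chosen
# subset of hosts under their own CRUSH root and point a write pool's
# rule at it, so new data only lands on those hosts.  Everything here
# (hostnames, pool name, rule name) is hypothetical.

active_hosts = ["host01", "host02"]   # hosts allowed to take new writes
ingest_pool = "cold-ingest"           # pool that receives incoming data

cmds = ["ceph osd crush add-bucket active-root root"]
for host in active_hosts:
    # moving a host under active-root removes it from its old root,
    # which triggers data movement for anything already mapped there
    cmds.append("ceph osd crush move %s root=active-root" % host)

# rule that only selects OSDs under active-root, one chunk per host
cmds.append("ceph osd crush rule create-simple active-write active-root host")
# point the ingest pool at the new rule (look up the rule id first)
cmds.append("ceph osd pool set %s crush_ruleset <rule-id>" % ingest_pool)

print("\n".join(cmds))

Two caveats: for an EC pool the rule is generated from the
erasure-code profile, which can also be pointed at a specific CRUSH
root - same idea, different plumbing. And, as the comment says,
rotating hosts in and out of that root moves data around, which works
against your "minimize rebalancing" goal, so in practice you would
probably create a new pool (and rule) per ingest period and leave the
filled ones alone rather than reshuffling a single pool.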