Thank you very much for your reply.

The literature I reviewed highlights the lack of large-scale traces for
creating synthetic workloads, as in [1]. I will keep an eye on these
topics for the future. I think the "End-to-end performance visualization"
idea for this year's GSoC would be a good start for me.

[1] http://web.stanford.edu/~cdel/Storage_I-O_google_poster.pdf

On Fri, Mar 11, 2016 at 4:28 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Hi Victor,
>
> On Thu, 10 Mar 2016, Victor Araujo wrote:
>> Hello,
>>
>> I am a first-year master's student in Finland. I recently discovered
>> Ceph and have been getting acquainted with the concepts and the
>> codebase.
>>
>> I noticed the blueprint
>> http://tracker.ceph.com/projects/ceph/wiki/Osd_-_Tiering_II_(Warm-%3ECold)
>> and I was wondering if there is a particular piece of it that would
>> be suitable for this year's GSoC.
>
> The 'tiering ii' is the direction we want to go in. We need to be careful
> about designing this interface, though, and the support in the OSD to make
> use of it is delicate. It's an important effort but it is probably not
> the best candidate for a short GSoC project. Perhaps we can figure out a
> smaller piece of this that is appropriately scoped...
>
>> I also noticed several other blueprints related to cache tiering, like
>> http://tracker.ceph.com/projects/ceph/wiki/Improvement_on_the_cache_tiering_eviction
>> suggesting improvements or other implementations like ARC. Would it
>> make sense to enable some sort of cache plugins, similar to what was
>> done with erasure code plugins, to allow different strategies for
>> particular use cases and facilitate testing other implementations? If
>> it makes sense and it could be scoped for GSoC I'd be very interested.
>
> There are several ideas floating around for how to improve the existing
> cache tier code, but most of them are effectively tabled because we don't
> have a good sense of whether they will help or not.
>
> Perhaps the most valuable project for cache tiering, or tiering in
> general, would be to create a set of tools and best practices for
> simulating realistic workloads and better quantifying the performance of
> tiering. Tests are hard to set up: you need reasonably sized base and
> cache tiers, cache tier OSDs that are faster than the base tier devices,
> workloads with realistic object temperature variation, tests that run long
> enough to warm up the cache, and metrics that let us understand what the
> behavior is and what is and isn't working. It's a lot of
> mostly-not-coding work, but without it it's hard to tell if any change we
> make is making things better or worse.
>
> Does anybody else have ideas here?
>
> sage
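
As a rough illustration of the workload-simulation idea discussed above, here is a minimal Python sketch. It is a toy model under stated assumptions and not Ceph code: object "temperature" is approximated with a Zipf-like popularity distribution, the cache tier is modelled as a plain LRU cache, and the only metric is the hit ratio measured after a warm-up period. The function names (zipf_cum_weights, simulate_lru) and all parameters are hypothetical.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def zipf_cum_weights(num_objects, skew):
    """Cumulative weights where object i has popularity 1 / (i + 1) ** skew.

    skew = 0 gives a uniform workload; larger values make a few objects "hot".
    """
    return list(accumulate(1.0 / (rank + 1) ** skew for rank in range(num_objects)))


def simulate_lru(num_objects, cache_size, num_requests, warmup, skew, seed=0):
    """Replay a synthetic skewed workload against a toy LRU cache.

    Returns the hit ratio counted only over requests issued after the first
    `warmup` requests, so cold-start misses do not distort the metric.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    cum_weights = zipf_cum_weights(num_objects, skew)
    requests = rng.choices(range(num_objects), cum_weights=cum_weights, k=num_requests)

    cache = OrderedDict()  # object id -> True, ordered least to most recently used
    hits = 0
    measured = 0
    for i, obj in enumerate(requests):
        hit = obj in cache
        if hit:
            cache.move_to_end(obj)  # refresh recency on a hit
        else:
            cache[obj] = True  # "promote" the object on a miss
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used object
        if i >= warmup:
            measured += 1
            hits += int(hit)
    return hits / measured if measured else 0.0


if __name__ == "__main__":
    for skew in (0.0, 0.8, 1.2):
        ratio = simulate_lru(num_objects=1000, cache_size=100,
                             num_requests=20000, warmup=5000, skew=skew)
        print(f"skew={skew}: post-warm-up hit ratio {ratio:.3f}")
```

Comparing runs with different skew values, cache sizes, or eviction policies in a model like this is far cheaper than a full cluster test, but it says nothing about real device speeds or promotion costs, so it can only complement the kind of testing described above.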