Sorry, correction: 100% rw.

On Thu, May 19, 2016 at 1:34 PM, Haomai Wang <haomai@xxxxxxxx> wrote:
> 100%, because it's a small rbd image; all metadata should be cached.
>
> On Thu, May 19, 2016 at 10:39 AM, Jianjian Huo <jianjian.huo@xxxxxxxxxxx> wrote:
>>
>> On Wed, May 18, 2016 at 6:35 PM, Haomai Wang <haomai@xxxxxxxx> wrote:
>>> On Thu, May 19, 2016 at 7:50 AM, Jianjian Huo <jianjian.huo@xxxxxxxxxxx> wrote:
>>>>
>>>> On Wed, May 18, 2016 at 10:19 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>>>>
>>>>> Hi Rajath!
>>>>>
>>>>> Great to hear you're interested in working with us outside of GSoC!
>>>>>
>>>>> On Wed, 18 May 2016, Haomai Wang wrote:
>>>>> > Hi Rajath,
>>>>> >
>>>>> > We are glad to see your passion. From my view, sage is planning to
>>>>> > implement a userspace cache in bluestore itself; see
>>>>> > (https://github.com/ceph/ceph/commit/b9ac31afe5a162176f019baa25348014a77f6ab8#commitcomment-17488250).
>>>>> >
>>>>> > I guess the cache won't be a generic cache interface. Instead, it will
>>>>> > be bound to the specific objects that need it. So maybe sage can give a
>>>>> > brief summary?
>>>>>
>>>>> Part of the reason this project wasn't at the top of our list (we got
>>>>> fewer slots than we had projects) is that the BlueStore code is in
>>>>> flux and moving quite quickly. On the BlueStore side, we are building a
>>>>> simple buffer cache that is tied to an Onode (the in-memory per-object
>>>>> metadata structure) and integrated tightly with the read and write IO
>>>>> paths. This will eliminate our use of the block device buffer cache for
>>>>> user/object data.
>>>>>
>>>>> The other half of the picture, though, is the BlueFS layer that is
>>>>> consumed by rocksdb: it also needs caching in order for rocksdb to
>>>>> perform at all. My hope is that the code we write for the user data
>>>>> can be reused here as well, but it is still evolving.
>>>>
>>>> When BlueStore moves away from the kernel cache to its own buffer cache,
>>>> RocksDB can use its own buffer cache as well. RocksDB has a block cache
>>>> of configurable size for uncompressed data blocks; it can serve as the
>>>> buffer cache, since BlueStore doesn't compress metadata in RocksDB.
>>>
>>> Actually, this doesn't behave as expected. In my last nvmedevice
>>> benchmark, lots of reads still went down to the device instead of being
>>> cached by rocksdb, even when I set a very large block cache. I guess
>>> there are some gaps between our usage and the rocksdb implementation.
>>
>> What kind of workload did you use for that benchmark, 100% read?
>>
>>>
>>>> Jianjian
>>>>>
>>>>> The main missing piece, I'd say, is a way to string Buffer objects
>>>>> together in a global(ish) LRU (or a set of LRUs, or whatever we need
>>>>> for the caching policy that makes sense) so that trimming can be done
>>>>> safely and efficiently. Right now the code is lock-free because each
>>>>> Onode is only touched under the collection rwlock, but in order to do
>>>>> trimming we need to be able to reap cold buffers from a global context.
>>>>>
>>>>> Anyway, there is no clear or ready answer here yet, but we are ready to
>>>>> discuss design/approach here on the list, and welcome your input (and
>>>>> potentially, contributions to development!).
>>>>>
>>>>> sage
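
For discussion, here is a rough (and untested) sketch of the "string Buffer
objects together in a global LRU" idea. All type and member names below are
made up for illustration, and the single global mutex is exactly the piece
that conflicts with the current lock-free, collection-rwlock-only design, so
treat it as a strawman rather than a proposal:

    // Strawman: per-Onode buffers linked into one global LRU so a trimmer
    // can reap cold buffers. Names (Buffer/Onode/BufferCache) are invented
    // for illustration; this is NOT the actual BlueStore code.
    #include <boost/intrusive/list.hpp>
    #include <cstdint>
    #include <map>
    #include <memory>
    #include <mutex>
    #include <vector>

    namespace bi = boost::intrusive;

    struct Onode;

    struct Buffer : public bi::list_base_hook<> {
      Onode *onode;            // back-pointer so the trimmer can evict
      uint64_t offset;
      std::vector<char> data;  // stand-in for a bufferlist
      Buffer(Onode *o, uint64_t off, size_t len)
        : onode(o), offset(off), data(len) {}
    };

    struct Onode {
      // per-object cache: logical offset -> cached buffer
      std::map<uint64_t, std::unique_ptr<Buffer>> buffers;
    };

    class BufferCache {
      std::mutex lock;        // guards the global LRU; see caveat below
      bi::list<Buffer> lru;   // cold at front, hot at back
      uint64_t total = 0;
      uint64_t max;
    public:
      explicit BufferCache(uint64_t max_bytes) : max(max_bytes) {}

      // Caller is assumed to hold the collection lock for *o.
      Buffer *add(Onode *o, uint64_t off, size_t len) {
        auto b = std::make_unique<Buffer>(o, off, len);
        Buffer *raw = b.get();
        o->buffers[off] = std::move(b);
        std::lock_guard<std::mutex> l(lock);
        lru.push_back(*raw);
        total += len;
        return raw;
      }

      void touch(Buffer *b) {  // call on read hit to mark the buffer hot
        std::lock_guard<std::mutex> l(lock);
        lru.erase(lru.iterator_to(*b));
        lru.push_back(*b);
      }

      // Reap cold buffers from a global context. Caveat: in real code this
      // races with IO paths that touch the Onode under only the collection
      // rwlock, which is exactly the locking problem being discussed.
      void trim() {
        std::lock_guard<std::mutex> l(lock);
        while (total > max && !lru.empty()) {
          Buffer &cold = lru.front();
          lru.pop_front();                         // unlink before destroy
          total -= cold.data.size();
          cold.onode->buffers.erase(cold.offset);  // frees the Buffer
        }
      }
    };

The awkward part is the Onode back-pointer: trim() erases from a per-Onode
map while IO paths may be touching that same Onode under only the collection
rwlock, which is the race Sage describes.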
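And on the rocksdb block cache sub-thread: for reference, this is the
standard rocksdb API for sizing the uncompressed block cache (the 1 GB
figure and the db path are arbitrary). One thing that might be worth
checking against the benchmark above is whether the reads in question were
issued with ReadOptions::fill_cache = false, since such reads do not
populate the block cache:

    #include <cassert>
    #include <rocksdb/cache.h>
    #include <rocksdb/db.h>
    #include <rocksdb/options.h>
    #include <rocksdb/table.h>

    int main() {
      rocksdb::Options options;
      options.create_if_missing = true;

      rocksdb::BlockBasedTableOptions table_options;
      // Cache for uncompressed data blocks; 1 GB here purely as an example.
      table_options.block_cache = rocksdb::NewLRUCache(1ULL << 30);
      // Keep index and filter blocks in the same cache too; otherwise
      // those reads can still go down to the device.
      table_options.cache_index_and_filter_blocks = true;
      options.table_factory.reset(
          rocksdb::NewBlockBasedTableFactory(table_options));

      rocksdb::DB *db = nullptr;
      rocksdb::Status s =
          rocksdb::DB::Open(options, "/tmp/blockcache_test", &db);
      assert(s.ok());
      delete db;
      return 0;
    }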
>>>>>
>>>>> >
>>>>> > On Wed, May 18, 2016 at 9:32 PM, Haomai Wang <haomai@xxxxxxxx> wrote:
>>>>> > >
>>>>> > >
>>>>> > > On Wed, May 18, 2016 at 2:44 PM, Rajath Shashidhara
>>>>> > > <rajath.shashidhara@xxxxxxxxx> wrote:
>>>>> > >>
>>>>> > >> Hello,
>>>>> > >>
>>>>> > >> I was a GSoC'16 applicant for the project "Implementing Cache layer on
>>>>> > >> top of NVME Driver". Unfortunately, I was not selected for the
>>>>> > >> internship.
>>>>> > >
>>>>> > > Hi Rajath,
>>>>> > >
>>>>> > > We are glad to see your passion. From my view, sage is planning to
>>>>> > > implement a userspace cache in bluestore itself; see
>>>>> > > (https://github.com/ceph/ceph/commit/b9ac31afe5a162176f019baa25348014a77f6ab8#commitcomment-17488250).
>>>>> > >
>>>>> > > I guess the cache won't be a generic cache interface. Instead, it will
>>>>> > > be bound to the specific objects that need it. So maybe sage can give a
>>>>> > > brief summary?
>>>>> > >
>>>>> > >> However, I would be interested in working on the project as an
>>>>> > >> independent contributor to Ceph.
>>>>> > >> I am expecting to receive the necessary support from the Ceph
>>>>> > >> developer community.
>>>>> > >>
>>>>> > >> In case I missed any important details in my project proposal, or if I
>>>>> > >> have the wrong understanding of the project, please help me figure out
>>>>> > >> the details.
>>>>> > >>
>>>>> > >> Looking forward to contributing!
>>>>> > >>
>>>>> > >> Thank you,
>>>>> > >> Rajath Shashidhara