On Thu, 14 May 2015, Srikanth Madugundi wrote:
> Adding ceph-devel
>
> Sage,
>
> Do you have a timeline to support the feature to store multiple objects into
> one fragment?

No timeline for this... it isn't something I've thought about at all. It'd
probably be multiple objects in one fid, though--the fragments would be
non-overlapping. Unless we wanted to do some deduping thing with shared
fragments (where, say, you write the same content to multiple objects
simultaneously).

sage

>
> Regards
> Srikanth
>
> On Thu, May 14, 2015 at 12:27 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> > On Thu, 14 May 2015, Srikanth Madugundi wrote:
> > > Thanks Sage for the details,
> > > As per your email mentioned in the newstore blueprint.
> > >
> > > http://marc.info/?l=ceph-devel&m=142438985013041&w=2
> >
> > can we start ccing ceph-devel?
> >
> > > You have defined the following structures, which have definitions for the
> > > offset in the fragment and the length of the fragment. Just wondering how
> > > this is used. If each object is stored in one file, why do we need the
> > > offset in the file and the length?
> > >
> > >   struct fragment_t {
> > >     uint32_t offset;   ///< offset in file to first byte of this fragment
> > >     uint32_t length;   ///< length of fragment/extent
> >
> > Right now there's only 1 fragment per object, so offset is always == 0.
> > That will eventually change (and read path can already handle it).
> >
> > >     fid_t fid;         ///< file backing this fragment
> > >
> > > and fid_t is
> > >
> > >   struct fid_t {
> > >     uint32_t fset, fno;   // identify the file name: fragments/%d/%d
> >
> > Yep
> >
> > sage
> >
> > > Regards
> > > Srikanth
> > >
> > > On Thu, May 14, 2015 at 9:11 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> > > > On Thu, 14 May 2015, Srikanth Madugundi wrote:
> > > > > Hi Sage,
> > > > > I set up a test cluster with object store as "newstore rocksdb". I
> > > > > noticed that the number of files created is equal to the number of
> > > > > objects for objects of size 100K. Is this expected? As per the
> > > > > blueprint document you mentioned that multiple objects are packed
> > > > > into one file under fragments/%d/%d.
> > > >
> > > > It's expected.. I don't remember talking about packing multiple objects
> > > > into a single fragment. We *do* plan to put a single object across
> > > > multiple fragments if it is big (to avoid cost/complexity of
> > > > write-ahead for overwrite on large writes), but that isn't
> > > > implemented yet.
> > > >
> > > > > I ran another test with 10K size objects and noticed that the objects
> > > > > are stored under db/*.sst files. The objects are not created under
> > > > > the fragments/ directory for objects of size 10K.
> > > >
> > > > db/ is rocksdb. Small objects are exclusively there (by default, you
> > > > can disable this with the newstore overlay settings).
> > > >
> > > > > I plan to read through the newstore code but wanted to check with you
> > > > > what is the object size when newstore starts to aggregate multiple
> > > > > objects into single file?
> > > >
> > > > s
> > > >
> > > > > Regards
> > > > > Srikanth
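
Editorial illustration (not part of the original thread): the offset and length
fields discussed above only become interesting once more than one object shares
a file. The sketch below is hypothetical and is not newstore code. It copies the
two structs as quoted in the thread and adds three invented helpers
(fragment_path, append_fragment, overlaps) to show how "multiple objects in one
fid" with non-overlapping fragments might look. Per the thread, nothing like
this was implemented or scheduled at the time.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Field layout as quoted in the thread.
struct fid_t {
  uint32_t fset, fno;  // identify the file name: fragments/%d/%d
};

struct fragment_t {
  uint32_t offset;  // offset in file to first byte of this fragment
  uint32_t length;  // length of fragment/extent
  fid_t fid;        // file backing this fragment
};

// Hypothetical helper: relative path of the file behind a fid,
// following the fragments/%d/%d pattern from the thread.
std::string fragment_path(const fid_t& fid) {
  return "fragments/" + std::to_string(fid.fset) + "/" +
         std::to_string(fid.fno);
}

// Hypothetical packer: place a new object of `length` bytes at the
// current end of the file, so its fragment starts at a nonzero offset
// whenever the file already holds data.
fragment_t append_fragment(const fid_t& fid, uint32_t file_end,
                           uint32_t length) {
  assert(file_end <= UINT32_MAX - length);  // stay within 32-bit offsets
  return fragment_t{file_end, length, fid};
}

// Hypothetical check: do two fragments cover a common byte of the same file?
bool overlaps(const fragment_t& a, const fragment_t& b) {
  if (a.fid.fset != b.fid.fset || a.fid.fno != b.fid.fno) return false;
  if (a.length == 0 || b.length == 0) return false;
  // Widen to 64 bits so offset + length cannot wrap around.
  const uint64_t a_end = uint64_t{a.offset} + a.length;
  const uint64_t b_end = uint64_t{b.offset} + b.length;
  return a.offset < b_end && b.offset < a_end;
}
```

With one fragment per object, as in the behaviour described above, every
fragment would have offset 0 and its own fid. In this sketch a second 100K
object appended to the same fid would instead start at offset 102400, which is
the case the offset field exists for.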