Re: Will the number of objects that have ever existed be infinite?

Hi Li!

On Sat, 23 May 2015, ??? wrote:
> Hello!
> 
> I'm a GSoC student this year and my job is to introduce Missing Rate
> Curve (or reuse distance exactly) of objects into OSD. Now I'm trying
> to find a proper algorithm to implement but there is a problem: Should
> I take the number of objects tracked in an OSD as infinite or
> constant?
> 
> The point is that there is an algorithm that uses hashing to sample only
> a constant number of references for the analysis and is proven to be
> accurate, which makes online MRC construction possible. That
> accuracy depends on the fact that memory addresses are
> bounded, while objects can be deleted and created again and again in
> Ceph. Is it reasonable to assume that an OSD serves only a bounded number
> of objects in its lifetime (or in the time period over which we want to
> compute the MRC)?

I don't remember how the object count affects the MRC, but I suspect we 
will want to use a strategy similar to what the HitSets do:

 - a new HitSet is generated on a periodic basis
 - each time a new one is started, we size it based on the previous
   iteration: we can compare the number of HitSet (bloom filter)
   insertions we've done with the resulting filter density.
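
To make the sizing idea concrete, here is a rough sketch in Python (not Ceph's actual HitSet code, which is C++). It uses the standard Bloom filter formulas: the density of the previous filter gives an estimate of how many distinct items were inserted, and that estimate sizes the next one. All names here (bloom_params, estimate_distinct, size_next_period, the headroom factor) are made up for illustration.

```python
import math

def bloom_params(expected_insertions, target_fpp):
    """Return (bits, hash_count) for a standard Bloom filter."""
    n = max(1, expected_insertions)
    bits = math.ceil(-n * math.log(target_fpp) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n * math.log(2)))
    return bits, hashes

def estimate_distinct(bits, hashes, bits_set):
    """Estimate distinct insertions from the filter density (set bits / bits)."""
    if bits_set >= bits:
        return float("inf")  # filter is saturated, density tells us nothing
    return -(bits / hashes) * math.log(1.0 - bits_set / bits)

def size_next_period(prev_bits, prev_hashes, prev_bits_set, target_fpp,
                     headroom=1.25):
    """Size the next period's filter from the previous period's density.

    headroom is a hypothetical safety factor for workload growth.
    """
    n_est = estimate_distinct(prev_bits, prev_hashes, prev_bits_set)
    if math.isinf(n_est):
        # Saturated: we only know the filter was too small, so double it.
        return prev_bits * 2, prev_hashes
    return bloom_params(math.ceil(n_est * headroom), target_fpp)
```

Because each period is sized from the one before it, the tracked set never has to be assumed bounded over the OSD's whole lifetime, only over one period.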

I think we'll want to build periodic MRCs anyway since the workload will 
shift over time.  Ceph explicitly tracks the number of objects within each 
PG (see pg_stats_t).
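
As a rough sketch of what a per-period MRC could look like (Python for brevity, and not necessarily the algorithm you have in mind): sample objects by hashing their names at a fixed rate, track reuse distances for the sampled objects only, scale the distances by 1/rate, and start from scratch each period. The names sampled, miss_ratio_curve and MODULUS are hypothetical, and the list scan makes this O(N*M), so it is only meant to show the idea.

```python
import hashlib
from collections import OrderedDict

MODULUS = 1 << 24  # hypothetical resolution of the sampling threshold

def sampled(name, rate):
    """Decide by hash whether an object is tracked; stable per object name."""
    h = int.from_bytes(hashlib.sha1(name.encode()).digest()[:8], "big")
    return (h % MODULUS) < rate * MODULUS

def miss_ratio_curve(trace, cache_sizes, rate=1.0):
    """Build an LRU miss ratio curve for one period's trace of object names."""
    stack = OrderedDict()  # sampled objects, most recently used last
    distances = []         # scaled reuse distance per access; None = first access
    for name in trace:
        if not sampled(name, rate):
            continue
        if name in stack:
            keys = list(stack)
            # Distinct sampled objects touched since the last access to name.
            d = len(keys) - 1 - keys.index(name)
            distances.append(d / rate)
            stack.move_to_end(name)
        else:
            distances.append(None)
            stack[name] = True
    total = len(distances)
    curve = {}
    for c in cache_sizes:
        # An LRU cache of c objects misses when the reuse distance is >= c.
        misses = sum(1 for d in distances if d is None or d >= c)
        curve[c] = misses / total if total else 0.0
    return curve
```

Calling miss_ratio_curve once per period on that period's accesses keeps the state bounded by the objects seen in that period, regardless of how many objects are deleted and recreated over the OSD's lifetime.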

Does that help?

sage