Re: How to calculate the nearfull ratio ?

Hello Loïc,

On Thu, May 4, 2017 at 8:30 AM Loic Dachary <loic@xxxxxxxxxxx> wrote:
> Is there a way to calculate the optimum nearfull ratio for a given crushmap ?

This is a question I was planning to cover in the calculations I was working on for python-crush. I have currently shelved that work for a few weeks, but I intend to pick it up again as time frees up.

Basically, I see this as a five-fold uncertainty problem:
1. CRUSH mappings are pseudo-random and therefore (usually) uneven.
2. Object distribution between placement groups has the exact same issue (a toy illustration follows this list).
3. Object sizes within a given pool can also vary greatly (from bytes to megabytes).
4. Failures and the subsequent re-balancing are also random.
5. Finally, pools can occupy different and overlapping sets of OSDs, and hold independent sets of objects.
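
To make #1 and #2 a little more concrete, here is a toy illustration (plain Python, not python-crush; the hash below is just a stand-in for CRUSH's pseudo-random mapping, and the cluster/pool sizes are made up):

    # Toy stand-in for a CRUSH mapping: spread PGs over OSDs with a hash.
    # Pseudo-random placement behaves like throwing balls into bins, so
    # some OSDs always end up with noticeably more PGs than the mean.
    import hashlib
    from collections import Counter

    NUM_OSDS = 20     # made-up cluster size
    NUM_PGS = 512     # made-up pg_num

    def place(pg):
        """Map a PG id to an OSD pseudo-randomly (stand-in for CRUSH)."""
        digest = hashlib.md5(("pg-%d" % pg).encode()).hexdigest()
        return int(digest, 16) % NUM_OSDS

    pgs_per_osd = Counter(place(pg) for pg in range(NUM_PGS))
    mean = NUM_PGS / float(NUM_OSDS)
    worst = max(pgs_per_osd.values())
    print("mean PGs/OSD: %.1f, fullest OSD: %d PGs (%.0f%% over the mean)"
          % (mean, worst, 100.0 * (worst - mean) / mean))

With numbers like these the fullest OSD typically lands a few tens of percent above the mean, and that gap is exactly what the nearfull ratio has to absorb.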

Thanks to your new CRUSH tools, I think #1 and #4 are solved respectively by the ability to:
- generate a CRUSH map for a precise (and even) distribution of PGs;
- test mappings for every scenario of N failures and find the worst-case one (a very expensive calculation, but possible; a rough headroom sketch follows).
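
Once the worst-case failure scenario is known, the headroom arithmetic for #4 is simple. A minimal sketch, assuming the displaced data re-balances evenly over the surviving OSDs (which a real CRUSH map only approximates):

    def safe_nearfull(osd_sizes, worst_failures, full_ratio=0.95):
        """Rough upper bound on the nearfull ratio so that losing the
        `worst_failures` largest OSDs still keeps the survivors below
        `full_ratio`, assuming the displaced data spreads out evenly."""
        total = float(sum(osd_sizes))
        lost = sum(sorted(osd_sizes, reverse=True)[:worst_failures])
        return full_ratio * (total - lost) / total

    # e.g. 20 equal-sized OSDs, tolerate 2 failures before hitting 0.95:
    print(safe_nearfull([1.0] * 20, worst_failures=2))   # ~0.855

That bound ignores the unevenness from #1-#3, so in practice a further statistical margin would have to be subtracted from it.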

Issues #2 and #3 are trickier. The big picture is that a given amount of data is placed more evenly the more objects there are, and there should be a way to use statistics to quantify that. Variance in object size then brings in more uncertainty, but I think that metric is difficult to quantify outside of very specific use cases where object sizes are known.
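
To give an idea of what such a statistical estimate could look like, here is a back-of-the-envelope sketch. It assumes objects are assigned to PGs independently (compound-Poisson counts) and uses the usual Gaussian rule of thumb for the maximum of many draws; the example numbers are invented:

    import math

    def pg_overshoot(objects_per_pg, num_pgs, size_cv=0.0):
        """Rough estimate of how far above the mean the fullest PG sits.
        objects_per_pg: average number of objects per PG
        num_pgs:        total number of PGs
        size_cv:        coefficient of variation of object sizes
                        (0.0 means all objects have the same size)
        Data per PG behaves like a compound-Poisson sum, so its relative
        standard deviation is sqrt((1 + cv^2) / objects_per_pg); the
        fullest of num_pgs PGs then sits roughly sqrt(2*ln(num_pgs))
        standard deviations above the mean."""
        rel_std = math.sqrt((1.0 + size_cv ** 2) / objects_per_pg)
        return math.sqrt(2.0 * math.log(num_pgs)) * rel_std

    # 1000 objects per PG, 4096 PGs, object sizes as spread out as an
    # exponential distribution (cv = 1): fullest PG ~18% over the mean.
    print("%.1f%%" % (100 * pg_overshoot(1000, 4096, size_cv=1.0)))

The 1/sqrt(objects_per_pg) factor is the "more objects, more even" effect, and size_cv is where #3 enters; the remaining difficulty is that this describes PGs, while the per-OSD figure also depends on how the PGs themselves are spread (#1).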

Finally, this might all be made redundant by the new auto-rebalancing feature that Sage is planning for Luminous. If we can assume even data placement at all times, then #4 is the only thing we need to worry about. Performance-based placement would be a very different story, however. And if pools have overlapping OSD sets, that could be fairly tricky too.

Maybe some other users here already have a rule of thumb or actual calculations for that. I was planning to get into the statistical calculations of data placement, assuming a uniform object size, as the next step for the paper I am working on. Would there be a need for such tools?

Regards,
--
Xavier Villaneau
Storage Software Eng. at Concurrent Computer Corp.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
