On Wed, 15 Jun 2016, Sage Weil wrote:
> On Tue, 14 Jun 2016, kefu chai wrote:
> > On Thu, Mar 3, 2016 at 3:15 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > > Latest branch:
> > >
> > > https://github.com/liewegas/ceph/commits/wip-reweight
> > >
> > > I made several changes:
> > >
> > > - new command, 'osd utilization'
> > >
> > > avg 51
> > > stddev 0.707107 (expected baseline 6.68019)
> >
> > Sage, sorry for bringing this up again. I don't quite understand what
> > the "expected baseline" is for. per the code,
> >
> > float edev = sqrt(avg_pg * (1.0 - (1.0 / (double)num_up_in)));
> >
> > so it sounds like a sort of std dev, but the unit of std dev here is
> > the "number of pg", while edev's unit is
> > num_pg^{1/2}, so it's not comparable with avg_pg or the
> > {base,new}_stddev. so could you shed some light on how we shall use
> > this number as a reference?
> > and what's the meaning of this number?
>
> (Re-reading 4.1 from the CRUSH paper... I always forget the math)
>
> The average osd utilization for the binomial placement process is mu = n*p
> (p=1/num_osds, n=num pgs) = num_pgs/num_osds, and the standard deviation
> is sqrt(n*p*(1-p)) = sqrt(avg_pgs*(1-p)) = sqrt(avg_pgs*(1-(1/num_osds))).
>
> The units for standard deviation are weird, yep.
>
> In any case, the baseline is there to tell you what the statistically
> expected std dev is so you can tell if something is awry...

...and reading the message you were quoting it looks like this value

> > > stddev 0.707107 (expected baseline 6.68019)

is wrong. I'm not sure where it came from, though. I checked the output
on a real cluster (lab cluster) and it's not there (although our
distribution on that cluster is all out of whack.. presumably for other
reasons).

sage
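For reference, the binomial baseline described in the message can be sketched in a few lines of Python. This is only an illustration of the formula sqrt(n*p*(1-p)), not the Ceph code: the function name expected_pg_stddev and the sample inputs (408 PGs over 8 OSDs) are made up for the example and are not taken from any cluster in this thread.

```python
import math


def expected_pg_stddev(num_pgs: int, num_osds: int) -> float:
    """Std dev of PGs per OSD if each PG lands on a uniformly random OSD.

    Hypothetical helper: models placement as a binomial process with
    n = num_pgs trials and p = 1/num_osds, as in the explanation above.
    """
    if num_pgs < 0 or num_osds < 1:
        raise ValueError("need num_pgs >= 0 and num_osds >= 1")
    p = 1.0 / num_osds
    avg_pgs = num_pgs * p  # mu = n * p
    return math.sqrt(avg_pgs * (1.0 - p))  # sqrt(n * p * (1 - p))


if __name__ == "__main__":
    # Assumed example inputs, chosen only so that the average is 51 PGs/OSD.
    num_pgs, num_osds = 408, 8
    print("avg", num_pgs / num_osds)
    print("expected baseline", expected_pg_stddev(num_pgs, num_osds))
```

An observed stddev well above this baseline would suggest the distribution is worse than random placement alone would explain; one well below it would suggest the placement is more even than chance.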