On Tue, Oct 30, 2012 at 01:01:54PM +0530, Raghavendra K T wrote: > On 10/30/2012 12:04 PM, Andrew Jones wrote: > >On Tue, Oct 30, 2012 at 11:27:52AM +0530, Raghavendra K T wrote: > >>On 10/29/2012 11:24 PM, Peter Zijlstra wrote: > >>>On Mon, 2012-10-29 at 19:37 +0530, Raghavendra K T wrote: > >>>>+/* > >>>>+ * A load of 2048 corresponds to 1:1 overcommit > >>>>+ * undercommit threshold is half the 1:1 overcommit > >>>>+ * overcommit threshold is 1.75 times of 1:1 overcommit threshold > >>>>+ */ > >>>>+#define COMMIT_THRESHOLD (FIXED_1) > >>>>+#define UNDERCOMMIT_THRESHOLD (COMMIT_THRESHOLD >> 1) > >>>>+#define OVERCOMMIT_THRESHOLD ((COMMIT_THRESHOLD << 1) - > >>>>(COMMIT_THRESHOLD >> 2)) > >>>>+ > >>>>+unsigned long kvm_system_load(void) > >>>>+{ > >>>>+ unsigned long load; > >>>>+ > >>>>+ load = avenrun[0] + FIXED_1/200; > >>>>+ load = load / num_online_cpus(); > >>>>+ > >>>>+ return load; > >>>>+} > >>> > >>>ARGH.. no that's wrong.. very wrong. > >>> > >>> 1) avenrun[] EXPORT_SYMBOL says it should be removed, that's not a > >>>joke. > >> > >>Okay. > >> > >>> 2) avenrun[] is a global load, do not ever use a global load measure > >> > >>This makes sense. Using a local optimization that leads to near global > >>optimization is the way to go. > >> > >>> > >>> 3) avenrun[] has nothing what so ever to do with runqueue lengths, > >>>someone with a gazillion tasks in D state will get a huge load but the > >>>cpu is very idle. > >>> > >> > >>I used loadavg as an alternative measure. But the above condition > >>poses a concern for that. > >> > >>Okay, now IIUC, usage of *any* global measure is bad? > >> > >>Because I was also thinking to use nrrunning()/ num_online_cpus(), to > >>get an idea of global overcommit sense. (ofcourse since, this involves > >>iteration over per CPU nrrunning, I wanted to calculate this > >>periodically) > >> > >>The overall logic, of having overcommit_threshold, > >>undercommit_threshold, I wanted to use for even dynamic ple_window > >>tuning purpose. > >> > >>so logic was: > >>< undercommit_threshold => 16k ple_window > >>>overcommit_threshold => 4k window. > >>for in between case scale the ple_window accordingly. > >> > >>The alternative was to decide depending on how ple handler succeeded in > >>yield_to. But I thought, that is too sensitive and more overhead. > >> > >>This topic may deserve different thread, but thought I shall table it here. > >> > >>So, Thinking about the alternatives to implement, logic such as > >> > >>(a) if(undercommitted) > >> just go back and spin rather than going for yield_to iteration. > >>(b) if (overcommitted) > >> better to yield rather than spinning logic > >> > >> of current patches.. > >> > >>[ ofcourse, (a) is already met to large extent by your patches..] > >> > >>So I think everything boils down to > >> > >>"how do we measure these two thresholds without much overhead in a > >>compliant way" > >> > >>Ideas welcome.. > >> > > > >What happened to Avi's preempt notifier idea for determining > >under/overcommit? If nobody has picked that up yet, then I'll go ahead and > >try to prototype it. > > Hi Drew, > > I had assumed my priority order as > 1) this patch series 2) dynamic ple window 3) preempt notifiers. > > But I do not have any problem on re-prioritizing / helping on these > as far as we are clear on what we are looking into. > > I was thinking about preempt notifier idea as a tool to refine > candidate VCPUs. But you are right, Avi, also told we can use > bitmap/counter itself as an indicator to decide whether we go ahead > with yield_to at all. > > IMO, only patch(3) has some conflict because of various approach we can > try.May be we should attack the problem via all 3 solutions at once and > decide? > > To be frank, within each of the approach, trying/analyzing all the > possibilities made the things slow.. (my end). > > Suggestions..? > I agree, it's a complex problem that needs lots of trial+error work. We should definitely work in parallel on multiple ideas. I'll go ahead and dig into the preempt notifiers. Drew -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html