----- "Lucas Meneghel Rodrigues" <lmr@xxxxxxxxxx> wrote:

> On Sun, 2010-02-14 at 12:07 -0500, Michael Goldish wrote:
> > ----- "Lucas Meneghel Rodrigues" <lmr@xxxxxxxxxx> wrote:
> >
> > > As our configuration system generates a list of dicts
> > > with test parameters, and that list might be potentially
> > > *very* large, keeping all this information in memory might
> > > be a problem for smaller virtualization hosts due to
> > > the memory pressure created. Tests made on my 4GB laptop
> > > show that most of the memory is being used during a
> > > typical kvm autotest session.
> > >
> > > So, instead of keeping all this information in memory,
> > > let's take a different approach and unfold all the
> > > tests generated by the config system and generate a
> > > control file:
> > >
> > > job.run_test('kvm', params={param1, param2, ...}, tag='foo', ...)
> > > job.run_test('kvm', params={param1, param2, ...}, tag='bar', ...)
> > >
> > > By dumping all the dicts that were before in the memory to
> > > a control file, the memory usage of a typical kvm autotest
> > > session is drastically reduced making it easier to run in smaller
> > > virt hosts.
> > >
> > > The advantages of taking this new approach are:
> > > * You can see what tests are going to run and the dependencies
> > >   between them by looking at the generated control file
> > > * The control file is all ready to use, you can for example
> > >   paste it on the web interface and profit
> > > * As mentioned, a lot less memory consumption, avoiding
> > >   memory pressure on virtualization hosts.
> > >
> > > This is a crude 1st pass at implementing this approach, so please
> > > provide comments.
> > >
> > > Signed-off-by: Lucas Meneghel Rodrigues <lmr@xxxxxxxxxx>
> > > ---
> >
> > Interesting idea!
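
For reference, the "unfolding" step described in the patch summary could look roughly like the sketch below. This is not the code from the patch: the signature of create_control(), the use of a 'shortname' key as the tag, and the choice to emit only the params and tag arguments are all assumptions made for illustration.

```python
def create_control(test_dicts, test_name="kvm"):
    """Return control-file text with one job.run_test() call per params dict.

    Hypothetical helper: the argument names and the 'shortname' key used
    for the tag are assumptions, not taken from the actual patch.
    """
    lines = []
    for params in test_dicts:
        tag = params.get("shortname", "")
        # repr() turns the dict into a Python literal, so the generated
        # control file needs no config parsing when it is executed.
        lines.append(
            f"job.run_test({test_name!r}, params={params!r}, tag={tag!r})"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(create_control([{"shortname": "foo"}, {"shortname": "bar"}]), end="")
```

Because the function only needs one dict at a time, it could also be fed from a generator, so the full list never has to exist in memory while the control file is written.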
> >
> > - Personally I don't like the renaming of kvm_config.py to
> > generate_control.py, and prefer to keep them separate, so that
> > generate_control.py has the create_control() function and
> > kvm_config.py has everything else. It's just a matter of naming;
> > kvm_config.py deals mostly with config files, not with control
> > files, and it can be used for other purposes than generating
> > control files.
>
> Fair enough, no problem.
>
> > - I wonder why so much memory is used by the test list. Our daily
> > test sets aren't very big, so although the parser should use a huge
> > amount of memory while parsing, nearly all of that memory should be
> > freed by the time the parser is done, because the final 'only'
> > statement reduces the number of tests to a small fraction of the
> > total number in a full set. What test set did you try with that 4 GB
> > machine, and how much memory was used by the test list? If a
> > ridiculous amount of memory was used, this might indicate a bug in
> > kvm_config.py (maybe it keeps references to deleted tests, forcing
> > them to stay in memory).
>
> This problem wasn't found during the daily test routine, rather it was
> a comment I heard from Naphtali about the typical autotest memory
> usage. Also Marcelo made a similar comment, so I thought it was a
> problem worth looking. I tried to run the default test set that we
> selected for upstream (3 resulting dicts) on my 4GB RAM laptop, here
> are my findings:
>
> * Before autotest usage: Around 20% of memory used, 10% used as
>   cache.
> * During autotest usage: About 99% of memory used, 27% used as
>   cache.

Before autotest usage, were there any VMs running?
3 dicts can't possibly take up so much space.
If it is indeed kvm_config's fault (which I doubt), there's probably a bug
in it that prevents it from freeing unused memory, and once we fix that
bug the problem should be gone.
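
One way to check how much space the final test list really takes is to add up the sizes of the dicts and their contents. The helper below is only a rough sketch: deep_sizeof() is a made-up name, it follows only dicts, lists, tuples and sets, and sys.getsizeof() reports interpreter-level object sizes rather than what the OS sees as resident memory.

```python
import sys


def deep_sizeof(obj, seen=None):
    """Approximate bytes used by obj plus the objects it contains.

    Hypothetical helper for illustration. Objects reachable more than
    once are counted a single time.
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size


if __name__ == "__main__":
    # Stand-in for the parser output: 3 dicts with 200 string params each.
    tests = [{f"key{i}": f"value{n}_{i}" for i in range(200)} for n in range(3)]
    print(deep_sizeof(tests), "bytes")
```

Calling this on the list returned by the parser, and comparing the result with the process size before and after parsing, would show whether the memory is held by the list itself or by something that is never freed.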

> So yes, there's a significant memory usage increase, that doesn't
> happen using a "flat", autogenerated control file. Sure it doesn't make
> my laptop crawl, but it is a *lot* of resource usage anyway.
>
> Also, let's assume that for small test sets, we can reclaim all
> memory back. Still we have to consider large test sets. I am all for
> profiling the memory usage and fixing any bugs, but we need to keep
> in mind that one might want to run large test sets, and large test sets
> imply keeping a fairly large amount of data in memory. If the amount
> of memory is negligible in most use cases, then let's just fix bugs and
> forget about using the proposed approach.
>
> Also, a "flat" control file is quicker to run, because there's no
> parsing of the config file happening in there. So, this control file

Agreed, but on the other hand, the static control file idea introduces an
extra preprocessing step (not necessarily bad).

> generation thing makes some sense, that's why I decided to code this
> 1st pass attempt at doing it.
>
> > - I don't think this approach will work for control.parallel,
> > because the tests have to be assigned dynamically to available
> > queues, and AFAIK this can't be done by a simple static control file.
>
> Not necessarily, as the control file is a program, we can just
> generate the code using some sort of function that can do the
> assignment. I don't fully see all that's needed to get the job done,
> but in theory it should be possible.

It sounds like you're suggesting to do the assignment statically, at
control file generation time. AFAIK a static assignment is no good,
because we have no way of knowing how long each test will take. If you're
suggesting to have the generated control file perform dynamic assignment
at run time, then I don't see how this can be done easily without loading
the entire test list into memory.
The parallel test scheduler needs the test list in memory, unless we're
willing to make a special effort to keep most of the list on disk at all
times.
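
To make the "keep most of the list on disk" option concrete, here is a minimal sketch of dynamic assignment that holds only one params dict per queue in memory. Everything in it is an assumption for illustration: the on-disk format (one JSON object per line), the names iter_tests() and run_parallel(), and the run_one callback standing in for the real test runner. It also ignores dependencies between tests, which the real scheduler has to track.

```python
import json
import threading


def iter_tests(path):
    """Yield one params dict at a time from a file with one JSON object per line."""
    with open(path) as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def run_parallel(path, run_one, num_queues=2):
    """Give the next test to whichever queue becomes free first.

    run_one(queue_id, params) is a hypothetical callback that runs one test.
    """
    tests = iter_tests(path)
    lock = threading.Lock()  # a generator must not be advanced concurrently

    def worker(queue_id):
        while True:
            with lock:
                params = next(tests, None)
            if params is None:  # list exhausted
                return
            run_one(queue_id, params)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_queues)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Since each queue pulls its next test only after finishing the previous one, a slow test on one queue does not hold up the others, and nothing needs to be known in advance about how long each test takes.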