Krzysztof A. Adamski wrote:
I've created async2 branch on my github account. This branch includes
all current fixes and changes and it's cleaned up so it can be
reviewed and merged. Here's the address:
http://github.com/kadamski/func/commits/async2
Full changes in one patch file can be downloaded from:
http://szrot.ry2n.pl/async2.patch
Change statistics:
func/forkbomb.py | 25 ++++++++++------
func/jobthing.py | 45 ++++++++++++++++++-----------
func/overlord/base_command.py | 4 ++-
func/overlord/client.py | 2 +-
func/overlord/cmd_modules/call.py | 58 +++++++++++++++++++++++++++++++++++-
test/async_test.py | 4 +-
6 files changed, 106 insertions(+), 32 deletions(-)
The statistics show that there are no changes that could break anything
other than async calls (which are already broken now).
Changes made in this branch are described below:
I. New features:
1. Added --async, --nopoll, --sort and --jobstatus command-line
options, as described in my earlier post:
https://www.redhat.com/archives/func-list/2008-May/msg00054.html
Commits:
http://github.com/kadamski/func/commit/ea70b80d0378b21da53166f76e796692ecf20b33
http://github.com/kadamski/func/commit/9de932b2608445f100565f4ee37d383d8dd72ff9
http://github.com/kadamski/func/commit/cbb7650e359f8870cd59501bc0e9904d5a9e6ea6
II. Fixes:
1. Bucket creation errors:
There was a simple error in the modulus arithmetic which caused the
results from the first minion to be ignored if the job was called on
more than one minion. Commits:
http://github.com/kadamski/func/commit/3a66e1bfe011bc8ffc42ad8dcda672c76af8352d
http://github.com/kadamski/func/commit/3f62e1d302fb63f1f3c5c72121388c8b0dd6e784
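For illustration only, here is a minimal sketch of modulus-based bucketing. It is not the code from forkbomb.py or from the commits above; bucketize() and the minion names are hypothetical. The point is that index % num_buckets always falls inside the bucket range, so the item at index 0 is kept rather than dropped.

```python
def bucketize(items, num_buckets):
    """Distribute items round-robin into num_buckets lists (hypothetical helper)."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    buckets = [[] for _ in range(num_buckets)]
    for index, item in enumerate(items):
        # index % num_buckets is always in range(num_buckets), so the
        # first item (index 0) lands in bucket 0 instead of being lost.
        buckets[index % num_buckets].append(item)
    return buckets


minions = ["alpha", "beta", "gamma", "delta", "epsilon"]  # made-up names
buckets = bucketize(minions, 2)
```

A quick sanity check for any bucketing scheme is that the bucket sizes add up to the number of input items.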
2. Huge memory leak:
I've found that the bsddb bindings in Python 2.5 are broken and
leak a lot of memory. Switching to dbm fixed the problem. Commit:
http://github.com/kadamski/func/commit/bb80b6e6f10bd97b2926f447e913d8249592c32d
3. Zombies on minion:
After each async call, the minion uses fork to spawn an async process.
The parent returns to its normal operation and never calls waitpid()
or handles the SIGCHLD signal. This means a zombie process is left
behind after each async call. I've implemented one of the possible
solutions, which uses a double fork to daemonize async jobs. A side
effect is that async jobs won't be killed when funcd is shut down;
instead they will run until they complete. Commit:
http://github.com/kadamski/func/commit/dee316020aaf6e70c909ea177adf2742a33e2b7b
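For illustration only, a minimal sketch of the double-fork idea described above. It is not the code from the commit; spawn_detached() is a hypothetical name, and the sketch assumes a POSIX system where os.fork() is available.

```python
import os


def spawn_detached(job):
    """Run job() in a grandchild process so the caller never owns a zombie.

    Hypothetical helper: returns in the calling process once the
    short-lived intermediate child has exited and been reaped.
    """
    pid = os.fork()
    if pid > 0:
        # Original process: the first child exits almost immediately,
        # so this waitpid() returns quickly and leaves no zombie behind.
        os.waitpid(pid, 0)
        return
    # First child and grandchild both end up in the finally clause.
    try:
        os.setsid()  # detach from the parent's session
        if os.fork() == 0:
            # Grandchild: orphaned once the first child exits, so it is
            # reparented (typically to init) and reaped there when done.
            job()
    finally:
        # Use os._exit() so neither forked process runs the caller's
        # cleanup handlers or returns into the caller's code.
        os._exit(0)
```

Because the grandchild is no longer a child of the calling process, the caller cannot wait for it or kill it by PID bookkeeping alone, which matches the side effect noted above.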
Another solution is to keep a list of child PIDs and make a
non-blocking waitpid() call from time to time to collect zombies. We
have to do this asynchronously to normal operation so that not too
many zombies are present for a long time. One option is to do this
in a SIGALRM handler and call signal.alarm(INTERVAL) each time.
There may be a problem with using alarm() and sleep() at the same
time, so we need to ensure no one is using sleep() in the main process.
This means we can't run any job (any code from modules) in the main
process; instead we have to call it in a fork (which could be a good
idea in general). I'll test this approach soon.
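For illustration only, the alternative described above could be sketched roughly as follows. reap_children() and install_reaper() are hypothetical names, the interval value is an assumption, and the sketch assumes a POSIX system. Note that Python only allows signal handlers to be installed from the main thread.

```python
import os
import signal


def reap_children(child_pids):
    """Collect exited children without blocking.

    child_pids is a set of PIDs we forked earlier; reaped PIDs are
    removed from it and returned.
    """
    reaped = set()
    for pid in list(child_pids):
        try:
            # WNOHANG makes waitpid() return (0, 0) if the child is
            # still running instead of blocking until it exits.
            done_pid, _status = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            # Not our child any more (already reaped elsewhere).
            done_pid = pid
        if done_pid == pid:
            child_pids.discard(pid)
            reaped.add(pid)
    return reaped


def install_reaper(child_pids, interval=5):
    """Reap children every `interval` seconds from a SIGALRM handler."""
    def handler(signum, frame):
        reap_children(child_pids)
        signal.alarm(interval)  # re-arm the alarm for the next round

    signal.signal(signal.SIGALRM, handler)
    signal.alarm(interval)
```

The caller would add each PID returned by os.fork() to the set and call install_reaper() once at startup.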
4. Concurrency in jobthing:
__update_status() was called just after the fork in both parent and
child. It was called twice to ensure that the update happens before
the job finishes rather than after, but this is not needed. Instead
it can be called once, before the fork. Commit:
http://github.com/kadamski/func/commit/721fa16f64627fcd5b0d0b5c4f9d9e382729bfa0
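For illustration only, a minimal sketch of why updating the status once before the fork is enough. This is not the jobthing.py code: the file-based status store, the status values and all function names are hypothetical. Since the "running" status is written before the child exists, the child's later "finished" write can never be overwritten by a late "running" update.

```python
import os

STATUS_RUNNING = "running"    # hypothetical status values
STATUS_FINISHED = "finished"


def write_status(status_dir, job_id, status):
    """Store the status of a job in a small file (hypothetical store)."""
    path = os.path.join(status_dir, job_id)
    tmp = path + ".tmp"
    with open(tmp, "w") as handle:
        handle.write(status)
    os.replace(tmp, path)  # atomic rename: readers never see a partial file


def read_status(status_dir, job_id):
    with open(os.path.join(status_dir, job_id)) as handle:
        return handle.read()


def start_job(status_dir, job_id, job):
    """Mark the job as running, then fork a child to execute it."""
    # Single update, done before the fork: whichever process runs first
    # afterwards, "running" is already recorded.
    write_status(status_dir, job_id, STATUS_RUNNING)
    pid = os.fork()
    if pid == 0:
        try:
            job()
            write_status(status_dir, job_id, STATUS_FINISHED)
        finally:
            os._exit(0)
    return pid
```

In this sketch the caller is responsible for eventually waiting on the returned PID.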
5. Handling remote errors with async calls:
The code in jobthing wasn't prepared for remote errors and
threw exceptions in such cases. It's fixed in:
http://github.com/kadamski/func/commit/fdabc175f9ef5ee9d1e6887409be58dd44cf661d
6. There was a --forks command-line parameter for the 'call' cmd_module
but it wasn't actually used. This small patch allows us to use it:
http://github.com/kadamski/func/commit/3750980ea547c73a63cb456a85be0b14cc36c30c
III. Cleanups:
1. Jobthing status codes:
There were JOB_ID_FINISHED and JOB_ID_ASYNC_FINISHED, which were
actually the same. NOTE: This change breaks the API. Commit:
http://github.com/kadamski/func/commit/2d23caf2f9e96022dd49781bc00c351dcab7c10c
2. Parameter names:
There is a batch_run() function in both jobthing.py and forkbomb.py.
They share the comments about the parameters, but somehow
jobthing.py was using different names, which I think was confusing.
Here's the patch to fix this:
http://github.com/kadamski/func/commit/58439ecd3433434b8fa31821619fec8e5bbeda27
3. Unneeded print statement:
Just removes one print statement:
http://github.com/kadamski/func/commit/9acec083713038365e804464cfaf7c22a405d0a0
4. Returning unformatted values from the do() function (as I believe
it's more useful):
http://github.com/kadamski/func/commit/8e96ee4101213ccaba11c95729bf88171873d03c
As always I would like to ask you all to test and/or review the code.
All comments and/or bug reports are welcome.
_______________________________________________
Func-list mailing list
Func-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/func-list
Looks good, and thanks for the detailed summary.
I merged/pushed all of the above after briefly testing on a couple of
machines, async_test.py still looks good also.
Wider testing is definitely welcome (I'd be interested to know from
Steve Salesvan whether your RHEL4 issues are fixed with this), as are
other comments -- we'll just base changes on what's committed from now
on if we need to tweak anything else.
Thanks! This will definitely be useful for powering things like the
async AJAX calls in FuncWeb too.
--Michael