Hi folks.
I've made some improvements to the async engine as well as the
multiprocess engine, and as a consequence (and by design), XMLRPC faults
are no more.
Returns that indicate errors look simply like this:
[ "REMOTE_ERROR", "name of exception", "value of exception",
"textual Python traceback" ]
That needs some explanation.
------
So, try the following:
ipython
import func.overlord.client as fc
fc.Client("*").test.explode()
You'll get something like this:
{'mdehaan.rdu.redhat.com': ['REMOTE_ERROR',
'exceptions.Exception',
'khhhhhhaaaaaan!!!!!!',
' File "/usr/lib/python2.5/site-packages/func/minion/server.py", line 132, in __call__\n rc = self.__method(*args)\n File "/usr/lib/python2.5/site-packages/func/minion/modules/test.py", line 29, in explode\n raise exceptions.Exception("khhhhhhaaaaaan!!!!!!")\n']}
---------
That's exactly what we want: it lets things like FuncWeb present the
errors nicely (especially in async mode), and it lets the caller know
exactly what happened on the remote end.
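A caller consuming the multi-host hash might split it into failures and
normal returns along these lines. This is only a sketch: the helper names
looks_like_remote_error and split_results are made up for illustration
(the real check lives in func.utils), the second host and its return value
are invented, and the traceback text is a placeholder.

```python
# Sketch only: these helpers are hypothetical and not part of Func.

def looks_like_remote_error(result):
    # An error return is a list whose first element is "REMOTE_ERROR".
    return (isinstance(result, (list, tuple))
            and len(result) > 0
            and result[0] == "REMOTE_ERROR")


def split_results(results):
    """Split a host -> result hash into (failures, successes)."""
    failures = {}
    successes = {}
    for host, result in results.items():
        if looks_like_remote_error(result):
            failures[host] = result
        else:
            successes[host] = result
    return failures, successes


sample = {
    "mdehaan.rdu.redhat.com": ["REMOTE_ERROR",
                               "exceptions.Exception",
                               "khhhhhhaaaaaan!!!!!!",
                               "(traceback text)"],
    "other.example.com": 1,  # invented host with an invented normal return
}

failures, successes = split_results(sample)
for host in sorted(failures):
    # Element 2 is the "value of exception" field.
    print("%s failed: %s" % (host, failures[host][2]))
```

Normal returns pass through untouched, which is the point of not wrapping
them in a special object.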
Naturally, if we want to target only one system, it is simpler to do and
we don't get the extra hash.
Note that we've intentionally avoided specially formatted return objects
for "normal returns". I made the mistake of implementing something
similar for Virt-Factory and hated it.
That's also important as we want to allow non-Python clients to speak to
Func/XMLRPC without needing to implement a lot of magic code to make
results actually usable. (I think we're going to be seeing a lot more
cross-language usage of Func in the future.)
----------
fc.Client("mdehaan.rdu.redhat.com",noglobs=True).test.explode()
What we get then is basically the same, without the extra record that
says where the result came from.
['REMOTE_ERROR',
'exceptions.Exception',
'khhhhhhaaaaaan!!!!!!',
' File "/usr/lib/python2.5/site-packages/func/minion/server.py", line 132, in __call__\n rc = self.__method(*args)\n File "/usr/lib/python2.5/site-packages/func/minion/modules/test.py", line 29, in explode\n raise exceptions.Exception("khhhhhhaaaaaan!!!!!!")\n']
---------
If a GUI were presenting this error, it would most likely be interested
in just printing the 'khhhhhhaaaaaan!!!!!!' part, unless the task
required a 3 dimensional solution :)
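For a display layer, picking out the message could look roughly like this.
The function name error_summary and its verbose flag are my own invention
for the sketch, and the traceback string is a placeholder rather than real
output.

```python
# Sketch only: error_summary is a hypothetical helper, not part of Func.

def error_summary(result, verbose=False):
    """Return a display string for a 4-element REMOTE_ERROR list."""
    marker, exc_name, exc_value, traceback_text = result
    if marker != "REMOTE_ERROR":
        raise ValueError("not a REMOTE_ERROR return")
    if verbose:
        # Full detail, e.g. for a log pane or a "details" button.
        return "%s: %s\n%s" % (exc_name, exc_value, traceback_text)
    # Short form: just the value of the exception.
    return exc_value


error = ["REMOTE_ERROR",
         "exceptions.Exception",
         "khhhhhhaaaaaan!!!!!!",
         "(traceback text goes here)"]

print(error_summary(error))
```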
So, about async, how's that work? Easy enough.
Let's do something simple... ask all of our nodes to sleep for 10 seconds.
Here's a demo program adapted from test_async.py in the source tree.
(Don't do this in IPython, BTW; the forks confuse it.)
------------
import time
import sys
import func.overlord.client as fc
import func.jobthing as jobthing

TEST_SLEEP = 10  # seconds

# ask every minion to sleep, using 10 forks, without blocking
client = fc.Client("*", nforks=10, async=True)
t1 = time.time()
job_id = client.test.sleep(TEST_SLEEP)

while True:
    # check on our status
    delta = time.time() - t1
    (code, results) = client.job_status(job_id)

    # print out our status as we get it in
    if code == jobthing.JOB_ID_RUNNING:
        print "task is still running..."
    elif code == jobthing.JOB_ID_ASYNC_PARTIAL:
        print "task reports partial status, %s elapsed, results = %s" % (delta, results)
    elif code == jobthing.JOB_ID_FINISHED:
        print "(non-async) task complete, %s elapsed, results = %s" % (delta, results)
        sys.exit(0)
    elif code == jobthing.JOB_ID_ASYNC_FINISHED:
        print "(async) task complete, %s elapsed, results = %s" % (delta, results)
        sys.exit(0)
    else:
        print "job not found: %s, %s elapsed" % (code, delta)

    # be nice and don't poll a whole lot
    time.sleep(1)
-------
So in the above example, what we see are partial results, with the
results from each node being available in "results" (which is a
system-by-system hash) as they come in, even if they come
in asynchronously. Anything that is an error also comes back in
results which you can check with
func.utils.is_error(result) -> True | False
So there are no try/except blocks to worry about, which would be kludgy
in asynchronous usage anyway.
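To show the idea of handling partial results without try/except, here is a
sketch that runs without Func installed. Everything in it is a stand-in:
FakeClient simulates a client whose job_status returns (code, results),
the status codes are made-up values rather than the jobthing constants,
the host names are invented, and is_error is my own guess at what a check
like func.utils.is_error might do. It also assumes the results hash is
cumulative from one poll to the next.

```python
# Sketch only: FakeClient, the status codes, and is_error are hypothetical.
RUNNING, PARTIAL, FINISHED = 0, 1, 2


class FakeClient(object):
    """Pretends that one more host reports back on every poll."""

    def __init__(self):
        self._arrivals = [
            ("a.example.com", 1),
            ("b.example.com", ["REMOTE_ERROR", "exceptions.Exception",
                               "boom", "(traceback)"]),
            ("c.example.com", 1),
        ]
        self._seen = {}

    def job_status(self, job_id):
        if self._arrivals:
            host, result = self._arrivals.pop(0)
            self._seen[host] = result
        code = PARTIAL if self._arrivals else FINISHED
        return (code, dict(self._seen))


def is_error(result):
    # Errors are ordinary return values, so a plain check is enough.
    return isinstance(result, list) and result[:1] == ["REMOTE_ERROR"]


def poll_until_done(client, job_id):
    """Poll job_status, noting each host the first time it shows up."""
    reported = {}
    log = []
    while True:
        code, results = client.job_status(job_id)
        for host in sorted(results):
            if host not in reported:
                reported[host] = results[host]
                status = "ERROR" if is_error(results[host]) else "ok"
                log.append("%s: %s" % (host, status))
        if code == FINISHED:
            return reported, log
        # a real loop would time.sleep() here between polls


reported, log = poll_until_done(FakeClient(), "job-1")
print("\n".join(log))
```

Because errors arrive as data, the loop treats failed and successful hosts
the same way and only branches when deciding how to label them.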
Right now the above code is not used by anything, though I think the
first logical consumer is FuncWeb, which should be a great place to try
this out if we get it addressing multiple systems.
---
So what exactly do async and multiprocess imply?
If we have 100 machines and 10 forks, that's 10 requests to initiate the
sleep per fork.
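The arithmetic can be sketched as below. How the work is actually divided
among forks isn't spelled out here, so the round-robin dealing is purely
an assumption for illustration, and the function name and host names are
made up.

```python
# Sketch only: round-robin splitting is an assumption, not Func's documented
# behaviour, and split_into_buckets is a hypothetical helper.

def split_into_buckets(hosts, nforks):
    """Deal hosts out round-robin into nforks lists."""
    buckets = [[] for _ in range(nforks)]
    for index, host in enumerate(hosts):
        buckets[index % nforks].append(host)
    return buckets


hosts = ["node%03d.example.com" % n for n in range(100)]  # invented names
buckets = split_into_buckets(hosts, 10)

# Each fork ends up responsible for 10 of the 100 requests.
print([len(bucket) for bucket in buckets])
```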
Requests to minions, whether async or not, are threaded, so multiple
minion requests can be outstanding at once.
Yet the return from the minion is immediate regardless of how long the
task takes (what comes back are minion job id's, somewhat cleverly hidden
from the caller in the above).
And from the caller's perspective, other than requesting the feature,
the calling signature of the code does not give them any knowledge of
the fact that the code is doing very crazy things with forks and minion
job id's behind the scenes.
So when async is on, it's async on both ends... the async stuff is
actually independent from the multiprocess stuff, but they are usable at
the same time. Thankfully, not everyone needs to know how this works, and
we keep the API calling signature fairly straightforward... the main
interest being a way to surface partial results as they come in, and
also to represent exception data in something more useful for display in
apps that need to do error handling.
Yipee.
--Michael