Async func calls, this time with feeling.

Hi folks.

I've made some improvements to the async engine as well as the multiprocess engine, and as a consequence (and by design), XMLRPC faults are no more.

Returns that indicate errors look simply like this:
[ "REMOTE_ERROR", "name of exception", "value of exception", "textual Python traceback" ]

That needs some explanation.

------

So, try the following:

ipython
import func.overlord.client as fc
fc.Client("*").test.explode()

You'll get something like this:

{'mdehaan.rdu.redhat.com': ['REMOTE_ERROR',
                           'exceptions.Exception',
                           'khhhhhhaaaaaan!!!!!!',
' File "/usr/lib/python2.5/site-packages/func/minion/server.py", line 132, in __call__\n rc = self.__method(*args)\n File "/usr/lib/python2.5/site-packages/func/minion/modules/test.py", line 29, in explode\n raise exceptions.Exception("khhhhhhaaaaaan!!!!!!")\n']}

---------

That's exactly what we want, as that allows things like FuncWeb to present the errors nicely (especially in Async mode) but also for the caller to know exactly what happened on the remote end.
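As a rough illustration of how a caller might consume that hash, here is a minimal sketch that separates hosts that failed from hosts that returned normally. The names looks_like_error and split_results are made up for this example and are not part of Func (in real code, func.utils.is_error is the check to use), and the host names are placeholders.

```python
REMOTE_ERROR = "REMOTE_ERROR"  # marker string seen in the output above


def looks_like_error(result):
    """Hypothetical check: does result have the four-element error shape?"""
    return (
        isinstance(result, (list, tuple))
        and len(result) == 4
        and result[0] == REMOTE_ERROR
    )


def split_results(results_by_host):
    """Split a {host: result} hash into (successes, failures)."""
    successes, failures = {}, {}
    for host, result in results_by_host.items():
        if looks_like_error(result):
            _marker, exc_name, exc_value, traceback_text = result
            failures[host] = {
                "name": exc_name,
                "value": exc_value,
                "traceback": traceback_text,
            }
        else:
            successes[host] = result
    return successes, failures


# Placeholder data in the same shape as the output above.
sample = {
    "host-a.example.com": [
        REMOTE_ERROR,
        "exceptions.Exception",
        "khhhhhhaaaaaan!!!!!!",
        "traceback text goes here",
    ],
    "host-b.example.com": 1,
}
ok, failed = split_results(sample)
```

One caveat with a shape-based check like this: a normal return that happens to be a four-element list starting with "REMOTE_ERROR" would be misread as an error.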

Naturally, if we want to target only one system, the call is simpler and we don't get the extra per-host hash.

Note that we've intentionally avoided specially formatted return objects for "normal returns". I made the mistake of implementing something similar for Virt-Factory and hated it. It's also important because we want non-Python clients to be able to speak to Func over XMLRPC without implementing a lot of magic code to make results actually usable. (I think we're going to be seeing a lot more cross-language usage of Func in the future.)

----------

fc.Client("mdehaan.rdu.redhat.com",noglobs=True).test.explode()

What we get then is basically the same, without the extra record that says where the result came from.

['REMOTE_ERROR',
'exceptions.Exception',
'khhhhhhaaaaaan!!!!!!',
' File "/usr/lib/python2.5/site-packages/func/minion/server.py", line 132, in __call__\n rc = self.__method(*args)\n File "/usr/lib/python2.5/site-packages/func/minion/modules/test.py", line 29, in explode\n raise exceptions.Exception("khhhhhhaaaaaan!!!!!!")\n']

---------

If a GUI were presenting this error, it would most likely be interested in just printing the 'khhhhhhaaaaaan!!!!!!' part, unless the task required a 3 dimensional solution :)
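A display layer could pull that part out along these lines. This is only a sketch: summarize_error is a made-up helper, not something Func provides, and it assumes the four-element error list shown above.

```python
def summarize_error(result, verbose=False):
    """Return a short message for an error list, or more detail on request."""
    _marker, exc_name, exc_value, traceback_text = result
    if verbose:
        # Exception name, message, then the remote traceback text.
        return "%s: %s\n%s" % (exc_name, exc_value, traceback_text)
    return exc_value


# Placeholder error in the shape returned with noglobs=True.
error = [
    "REMOTE_ERROR",
    "exceptions.Exception",
    "khhhhhhaaaaaan!!!!!!",
    "traceback text goes here",
]
short_message = summarize_error(error)
long_message = summarize_error(error, verbose=True)
```

The short form is what a status line would show; the verbose form keeps the remote traceback around for a details pane or a log.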

So, about async, how's that work? Easy enough.
Let's do something simple... ask all of our nodes to sleep for 10 seconds.

Here's a demo program adapted from test_async.py in the source tree.
(Don't do this in IPython BTW, the forks confuse it.)

------------

import time
import sys
import func.overlord.client as fc
import func.jobthing as jobthing

TEST_SLEEP = 10 # seconds

client = fc.Client("*",nforks=10,async=True)
t1 = time.time()
job_id = client.test.sleep(TEST_SLEEP)

while True:

    # check on our status

    delta = time.time() - t1
    (code, results) = client.job_status(job_id)

    # print out our status as we get it in

    if code == jobthing.JOB_ID_RUNNING:
        print "task is still running..."
    elif code == jobthing.JOB_ID_ASYNC_PARTIAL:
        print "task reports partial status, %s elapsed, results = %s" % (delta, results)
    elif code == jobthing.JOB_ID_FINISHED:
        print "(non-async) task complete, %s elapsed, results = %s" % (delta, results)
        sys.exit(0)
    elif code == jobthing.JOB_ID_ASYNC_FINISHED:
        print "(async) task complete, %s elapsed, results = %s" % (delta, results)
        sys.exit(0)
    else:
        print "job not found: %s, %s elapsed" % (code, delta)

    # be nice and don't poll a whole lot
    time.sleep(1)

-------

So in the above example, what we see are partial results: the results from each node become available in "results" (which is a system-by-system hash) as they come in, even though they arrive asynchronously. Anything that is an error also comes back in results, which you can check with

func.utils.is_error(result) -> True | False

So there are no try/except blocks to worry about, which would be kludgy in asynchronous usage anyway.
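To make that concrete, here is a small self-contained sketch of a polling loop that reports each host's result once, as it arrives, and flags errors without any try/except. Everything in it is a stand-in: the status codes are placeholder strings rather than the real func.jobthing values (and the two "finished" codes are collapsed into one), is_error only imitates func.utils.is_error, and job_status is a fake that replays canned replies instead of talking to minions.

```python
import time

# Placeholder status codes; the real ones live in func.jobthing.
RUNNING, PARTIAL, FINISHED = "running", "partial", "finished"


def is_error(result):
    """Stand-in for func.utils.is_error, based on the error list shape."""
    return (
        isinstance(result, list)
        and len(result) == 4
        and result[0] == "REMOTE_ERROR"
    )


def watch(job_status, job_id, on_result, poll_interval=1.0, sleep=time.sleep):
    """Poll job_status(job_id), calling on_result once per host."""
    seen = set()
    while True:
        code, results = job_status(job_id)
        if code in (PARTIAL, FINISHED):
            for host in sorted(results):
                if host not in seen:
                    seen.add(host)
                    on_result(host, results[host], is_error(results[host]))
        elif code != RUNNING:
            raise LookupError("job not found: %r" % (code,))
        if code == FINISHED:
            return
        sleep(poll_interval)


# Fake job_status that replays three canned replies.
replies = iter([
    (RUNNING, {}),
    (PARTIAL, {"a": 10}),
    (FINISHED, {
        "a": 10,
        "b": ["REMOTE_ERROR", "exceptions.Exception", "boom", "traceback text"],
    }),
])
events = []
watch(
    lambda job_id: next(replies),
    "job-1",
    lambda host, result, failed: events.append((host, failed)),
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
```

After the loop finishes, events holds one (host, failed) pair per host, in the order the results showed up.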

Right now the above code is not used by anything, though I think the first logical consumer is FuncWeb, which should be a great place to try this out if we get it addressing multiple systems.

---

So what exactly do async and multiprocess imply?

If we have 100 machines and 10 forks, that's 10 requests to initiate the sleep per fork. Requests to minions, whether async or not, are threaded, so multiple minion requests can be outstanding. Yet the return from the minion, regardless of how long the task takes, is immediate (these are minion job id's, somewhat cleverly hidden from the caller in the above). And from the caller's perspective, other than requesting the feature, the calling signature of the code does not give them any knowledge of the fact that the code is doing very crazy things with forks and minion job id's behind the scenes.
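The arithmetic can be sketched like this. It only illustrates the division of work and is not how Func actually assigns minions to forks; split_among_forks and the minion names are invented for the example.

```python
def split_among_forks(hosts, nforks):
    """Deal hosts out round-robin into at most nforks buckets."""
    if nforks < 1:
        raise ValueError("nforks must be at least 1")
    buckets = [[] for _ in range(min(nforks, len(hosts)))]
    for index, host in enumerate(hosts):
        buckets[index % len(buckets)].append(host)
    return buckets


# 100 placeholder minions spread over 10 forks.
hosts = ["minion%03d" % number for number in range(100)]
buckets = split_among_forks(hosts, 10)
sizes = [len(bucket) for bucket in buckets]
```

With 100 hosts and 10 forks every bucket ends up with 10 hosts, matching the "10 requests per fork" figure above.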

So when async is on, it's async on both ends. The async stuff is actually independent from the multiprocess stuff, but they are usable at the same time. Thankfully, not everyone needs to know how this works, and we keep the API calling signature fairly straightforward. The main interest is in having a way to surface partial results as they come in, and also to represent exception data in something more useful for display in apps that need to do error handling.

Yipee.

--Michael




_______________________________________________
Func-list mailing list
Func-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/func-list
