Pupus pipeline: what Adam has been doing, etc. etc.

"Adam D. Moss" <adam@xxxxxxxx> · Thu, 21 Dec 2000 23:15:17 +0000

Right.  If anyone knows or remembers who I am, they might
wonder what I've been up to for the past six months
since GimpCon 2000.  =)  If so, thanks for caring -- sit
back and I'll tell you!

Primarily, I'll admit, I've been busy with my super-mundane
day job, and diffused much of my remaining time with scattered
hackings.

/(** %<   Oh, bad dog.  Really.  Quite awful.
 '  `

GIMPwise however, apart from minor ambient maintenance and
musings I have been busy with two things:

1) "pquant", a terrifying colour-reduction algorithm probably
doomed to perpetual experimentation.

2) "pupus", an image-processing scheduler and propagation
framework.  This is squarely aimed at GIMP 2.0.

I'd mostly like to explain what "pupus" is about.  The name is a
working title and is short for "PUll-PUSh".  The project has
grown out of the ideas I hashed together in the airport waiting
to fly to GimpCon 2000 and attempted to present for about
eight hours (or four minutes when you factor out the rabbit-in-
headlights panicking; never fear that I shall do a presentation
again).

If you're not familiar with the original proposal then shame
on you!  I *slap* you!  Yet I cannot blame you, and it's okay because
things have changed a great deal.
      ,
'(OOOO)') baaa      That's a sheep to make sure you're still awake.
 || ||              She'll be keeping an eye on you.  Be wary.

The somewhat-simplified idea common to both proposals is that a
list/tree of little black boxes is set up, where images get
fed into the tree at the bottom, get chewed up by the black boxes
through which they are sequentially sent, and at the end of
the line comes a result.  If you think of the black boxes as
analogous to plug-ins or compositing operators then you'll see
that you've basically got a generalized way for a program to
conceptually project a layer-stack, spin an image around and
blur it -- whatever.  Have the right black boxes at hand, connect
them up just /so/, push in the desired source image(s) and wait
for your beautiful beautiful output to spew forth from the end
of the chain.
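
A minimal sketch of that idea in C, for the sake of having something
concrete to point at.  Every name here (Image, BlackBox, box_invert,
box_brighten, run_chain) is invented for illustration and is not the
pupus interface; it also assumes a single 8-bit channel and strictly
sequential, in-place processing, which is far simpler than the real
design.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the pupus interfaces. */
typedef struct {
    int width, height;
    unsigned char *pixels;          /* one grey byte per pixel */
} Image;

typedef struct {
    const char *name;
    void (*process)(Image *img, const void *params);  /* works in place */
    const void *params;             /* box-specific settings, may be NULL */
} BlackBox;

/* A box with no settings: invert every pixel. */
void box_invert(Image *img, const void *params)
{
    (void)params;
    for (int i = 0; i < img->width * img->height; i++)
        img->pixels[i] = (unsigned char)(255 - img->pixels[i]);
}

/* A box with one setting: add *params (an int) to every pixel, clamped. */
void box_brighten(Image *img, const void *params)
{
    int amount = *(const int *)params;
    for (int i = 0; i < img->width * img->height; i++) {
        int v = img->pixels[i] + amount;
        img->pixels[i] = (unsigned char)(v < 0 ? 0 : v > 255 ? 255 : v);
    }
}

/* Send the image through each box in order; the result is left in img. */
void run_chain(const BlackBox *boxes, size_t n_boxes, Image *img)
{
    assert(boxes != NULL && img != NULL);
    for (size_t i = 0; i < n_boxes; i++)
        boxes[i].process(img, boxes[i].params);
}
```

Building an array of two BlackBox entries (invert, then brighten) and
calling run_chain() on it is the whole "connect them up and push" story
in this toy version.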

In reality the devil is, as always, in the detail.

What, exactly, are we feeding into these black boxes?  Whence?
By what mechanism?  Who owns these 'images' that we're
transferring around?  What constitutes a black box, both physically
and in terms of the interfaces used to poke it with?  How would
we, say, tell a 'blur' box what radius of blur we desire?

How do we know when we've connected a black box's inputs and
outputs 'right'?  Can we set up a cyclic graph within the system?
What happens if we do?  In what order do things happen?  How would we
facilitate incremental rendering?  Can we retroactively revise
data already pushed into the pipeline?  Who is the man behind the
curtain?  How can we improve the user experience?  How do you stop
this crazy thing?

 _ ||_||
8: ) _  )~    The roadkill pig of puzzlement knows not.
 ~ || ||

The list is much longer than that.

Well, now I have a revised design and honest-to-goodness embryonic
prototype code, taking into account comments and suggestions from
GimpCon 2000 and various ideas from the intervening six months.

In contrast to the earlier proposal:

1) We're not going crazy on the resource-contention-avoidance malarky.
Hopefully that just drops out as a natural side-effect of the resource
ownership model.  There is no explicit resource-lockdown upon black-box
startup.

2) This time we support, nay, encourage in-place rendering and minimized
copying where plausible.

3) We're a lot friendlier towards black boxes who can't/won't work
on a 'regions on demand' basis.

4) Aborting a task pipeline is easier.

5) Changes to geometry (width, height, offsetting) figure into the
grand scheme.

6) We can spontaneously invalidate image regions from upstream while
they are still being processed downstream.

7) Latches and feedback-loops within the system might be facilitated
with a little more effort.  Some of the possibilities seemed too cool
to pass up.

. o O () O o . o O () O o . o O () O o . 

As the implementation stands,

1) We are toolkit-agnostic.  At the core we deal with tasks and
resources, not a user-interface.

2) We are transport-agnostic.  Only one transport-type is implemented
so far and even then not as cleanly as I'd like, but in theory we
can quite easily invoke these 'black boxes' (called 'steps' within the
code) on remote machines via CORBA or Convergence's GCim (?).

3) Black boxes are instantiated from factories implemented as .so files.
These are dynamically discovered at runtime.  They are currently
dynamically-linked to the main application at discovery-time but (in
theory...) can trivially be dynamically-linked to an alternative
transport shim and hence run from within a different address space or
indeed a different physical machine.

4) A few black boxes have been written for testing purposes.  All
interfaces are continually in flux and are slowly being pared down to
their essentials.

5) The basics of the pupus kernel/scheduler (the 'step manager') are
written; this is responsible for collating and organising work requests,
and is also responsible for tracking the resources afloat in the
system and servicing region read/write requests.

6) Pupus is colourspace, pixel-format and storage agnostic.  That's
Someone Else's Problem; we deal with high-level notions of 'surfaces',
regions thereof and similar.  The pixel-molesting black boxes have to
care, of course; at the moment for testing purposes RGB24 is a blanket
assumption.  This is where GEGL comes in, I think.

7) There are some lovely pipeline optimization opportunities that are
becoming apparent and can later be implemented at leisure without
affecting correctness.
  __  __
 (  `'  )   ***** Still reading?  I love you!
  \    /
   `.,'

Of the several black boxes ('steps') written for testing purposes, three
of them sort of _do_ something right now.

One box takes no inputs and emits an image read from a specific
PPM file.

Another box implements an RGB24 viewer written for GTK; it takes an
image as input and has no outputs (but as a side-effect it, of course,
*shows* whatever is being fed into it).

A third box writes a solid random colour wherever it is asked to draw
something, and invalidates the entirety of its own output at every
opportunity.  It is an example of a dynamically-changing input;
were you to connect the 'viewer' box to it you would see a continually-
flashing square.

So, as a trivial exercise we can create a 'ppm' box, connect
it to a 'view' box, and, well, we have a ppm viewing app.  Well, duh.

  stepFactory *ppm_factory;
  stepFactory *view_factory;
  stepInstanceID ppm_step_id;
  stepInstanceID view_step_id;

  ppm_factory  = step_manager_factory_by_name("PPM loader");
  ppm_step_id  = ppm_factory->step_instantiate();

  view_factory = step_manager_factory_by_name("RGB24 viewer");
  view_step_id = view_factory->step_instantiate();

  /* connect ppm output #0 to viewer input #0 */
  step_manager_connect (ppm_step_id, 0,
                        view_step_id, 0);

  /* hook the step-manager scheduler up to your favourite idle thread */
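
For the curious, here is a guess at the sort of thing that could sit
behind a look-up-by-name call like the one above.  It is only a sketch:
the type names are borrowed from the snippet for readability, but the
struct layouts, the fixed-size table and the registry_add() /
registry_find() / instantiate_generic() functions are all assumptions
made up for this example, and nothing here loads real .so files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layouts; the real pupus definitions are not shown in this mail. */
typedef int stepInstanceID;

typedef struct {
    const char *name;
    stepInstanceID (*step_instantiate)(void);
} stepFactory;

#define MAX_FACTORIES 16

static stepFactory registry[MAX_FACTORIES];
static size_t n_factories = 0;
static stepInstanceID next_id = 1;

/* Stand-in constructor: hands out a fresh instance ID each time. */
stepInstanceID instantiate_generic(void)
{
    return next_id++;
}

/* Record a factory under a name; returns 0 on success, -1 if full. */
int registry_add(const char *name, stepInstanceID (*fn)(void))
{
    assert(name != NULL && fn != NULL);
    if (n_factories == MAX_FACTORIES)
        return -1;
    registry[n_factories].name = name;
    registry[n_factories].step_instantiate = fn;
    n_factories++;
    return 0;
}

/* Linear search by name; returns NULL when no such factory exists. */
stepFactory *registry_find(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n_factories; i++)
        if (strcmp(registry[i].name, name) == 0)
            return &registry[i];
    return NULL;
}
```

In a version that really discovers plug-ins at runtime, registry_add()
would presumably be called once per shared object found, with a
constructor obtained from that object rather than a local stand-in.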

How many copies of the PPM's image data exist?  One.  The viewer
app doesn't even keep a back-buffer -- GTK exposure events are turned
into 'pull' requests sent upstream by the viewer so I could stress-test
things a little; transitory surface area buffers are created, passed
upstream for the PPM loader to fill, then passed downstream again,
committed to screen, and destroyed.

You can connect any number of observers to a given output.  I connected
200 independent viewers to the PPM loader.  How many copies of the
raw image's data were in the system at the end of the day?  Hah, trick
question -- still one.  That number could technically be zero were the
PPM loader to reload (or at least mmap()) the PPM data on demand --
we'll leave that to the space cadets.
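
The pull-with-a-transitory-buffer dance can be sketched in a few lines
of C.  This is an illustration under loud assumptions: Region, Source,
source_fill() and viewer_expose() are hypothetical names, the image is
one byte per pixel rather than RGB24, and summing the pixels stands in
for committing them to the screen.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int x, y, w, h; } Region;

typedef struct {
    int width, height;
    const unsigned char *data;      /* the one and only copy of the image */
} Source;

/* Upstream side: fill a caller-supplied buffer with the requested region. */
void source_fill(const Source *src, Region r, unsigned char *buf)
{
    assert(r.x >= 0 && r.y >= 0);
    assert(r.x + r.w <= src->width && r.y + r.h <= src->height);
    for (int row = 0; row < r.h; row++)
        memcpy(buf + (size_t)row * (size_t)r.w,
               src->data + (size_t)(r.y + row) * (size_t)src->width
                         + (size_t)r.x,
               (size_t)r.w);
}

/* Downstream side: handle an expose by pulling only the damaged region.
 * Returns a checksum of the pulled pixels, or -1 if allocation fails. */
long viewer_expose(const Source *src, Region damaged)
{
    size_t n = (size_t)damaged.w * (size_t)damaged.h;
    unsigned char *scratch = malloc(n);     /* transitory buffer */
    long checksum = 0;

    if (scratch == NULL)
        return -1;
    source_fill(src, damaged, scratch);     /* passed upstream to be filled */
    for (size_t i = 0; i < n; i++)          /* stand-in for "draw it" */
        checksum += scratch[i];
    free(scratch);                          /* destroyed straight away */
    return checksum;
}
```

Any number of viewers can call viewer_expose() against the same Source;
each one only ever holds a scratch buffer for the duration of a single
expose, so the long-lived pixel data stays at one copy.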

This email doesn't begin to cover the possibilities.
These are still baby steps -- there's a lot to do!  Let's look to
the future.

Where would such a thing fit into the GIMP scheme?  The first, but
uninteresting answer to spring to mind is 'oh, here and there
really, sounds like an unnecessarily complicated way to run plugins
but could be okay for composing the layer-stack, or something'.

*****    ****   **   *  **
**   *  **   *  **   *  **
*****   ******  ******  **
**   *  **   *  **   *
*****   **   *  **   *  **

No, that's just not thinking big enough!  Don't feed images
into black boxes; feed trees of black boxes into black boxes.  Don't
take the output of a tree of black boxes and put it into an image
buffer -- what good is it doing sitting in an image buffer?  Where is
*that* image buffer ending its days?  On-screen, for example?  Then
heck, turn your canvas into a black box in its own right, add it as
an observer and it can join the fun.

"But Adam," you whine, "if these trees don't transitorily exist
only long enough to get things done, but instead grow to bizarre
complexity so, like, even a layer sitting there in my image is
this tree of actions in its own right, well, won't recalculating
the whole world back to the conception of the universe just because
I exposed the damn window be, maybe, a little slow?"

I finger my laughable attempt at a goatee beard at this and appear to
consider the matter, but am merely toying with you -- sucker!

The step-manager 0wns you.  It runs your pipeline.  It instruments
your pipeline.  The punchline: adorable little black boxes automatically
dynamically-interposed at the most-stressed portions of the pipeline --
cache-boxes.  They remember image data that passes through them
from upstream and serve that to any downstream pulls of image
data, until the upstream source happens to invalidate that region.
These can move, and grow and shrink their memory pools according to
demand.  They're actually in theory normal black boxes conforming to the
usual API; they just happen to be one of a small set of black boxes
which the step-manager knows how to utilize for its own nefarious gain.
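
A toy version of the cache-box behaviour, to make the
remember / serve / invalidate cycle concrete.  This is a sketch under
assumptions, not the pupus code: Upstream, CacheBox and the cache_*()
functions are hypothetical, regions are reduced to a fixed set of
numbered tiles, and an int stands in for a tile's pixel data.

```c
#include <assert.h>
#include <stdbool.h>

#define N_TILES 4

/* Hypothetical upstream: "renders" a tile and counts how often it is asked. */
typedef struct {
    int render_count;
    int generation;     /* bumped whenever the upstream content changes */
} Upstream;

int upstream_render(Upstream *up, int tile)
{
    assert(tile >= 0 && tile < N_TILES);
    up->render_count++;
    return up->generation * 100 + tile;     /* stand-in for pixel data */
}

/* The cache box: remembers what passed through until told otherwise. */
typedef struct {
    Upstream *up;
    bool valid[N_TILES];
    int  data[N_TILES];
} CacheBox;

void cache_init(CacheBox *c, Upstream *up)
{
    c->up = up;
    for (int i = 0; i < N_TILES; i++)
        c->valid[i] = false;
}

/* Downstream pull: served from memory if valid, else fetched and kept. */
int cache_pull(CacheBox *c, int tile)
{
    assert(tile >= 0 && tile < N_TILES);
    if (!c->valid[tile]) {
        c->data[tile] = upstream_render(c->up, tile);
        c->valid[tile] = true;
    }
    return c->data[tile];
}

/* Called when the upstream source invalidates a region. */
void cache_invalidate(CacheBox *c, int tile)
{
    assert(tile >= 0 && tile < N_TILES);
    c->valid[tile] = false;
}
```

Repeated pulls of the same tile reach the upstream only once; after an
invalidation the next pull goes upstream again and the fresh result is
remembered in turn.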

So, in this example, pipeline-stress is coming from expose events?
The cache-boxes would move to nestle right up under your canvas-box,
so the fully-rendered final image would be one pipeline-step away.

Now, this can be taken too far.  I'm actually a bit uneasy about
leaving everything in pipeline-form from image-open through to
image-save -- but hey, it's up to you as the higher-level app
programmers.  I can build the bomb, you can drop it.

How would the "pupus" functionality be directly exposed to users?  The
answer is that it most assuredly WOULD NOT.  I do not advocate, in fact
I ABHOR the idea that the user should end up drawing a little tree
of boxes connected with wires.  That's your view as a programmer
(and even then there are likely to be a few utility layers of API
between you and the raw pupus pipeline constructor when I've had
my way), but your responsibility as an application designer is to
map a more user-friendly concept (good old layers, images, layer
masks, brushes et al) to this back-end.

A note on interactive pipeline stages such as brush-painting,
smearing etc: There are lots of ways to do this and I'd really
like to experiment with them all before commenting.  =)

Going forward, I have, as usual, very minimal spare time so this
message is a (not terribly) compressed view of several months both
in the past and future.  When I can begin to demonstrate a small
GIMP-like app built upon the pupus pipeline I will issue a
code-drop.  If I can't do that then it's worthless and I give up!

For those who have an upcoming holiday, have a good one!  See you
in January.

--Adam
-- 
Adam D. Moss    . ,,^^    adam@xxxxxxxx    http://www.foxbox.org/