Re: QoS with Artificial Intelligence

Linux Advanced Routing and Traffic Control


 



I'm guessing the "AI" bit is a simplified way of
expressing what they're after. "AI", per se, is
meaningless, because it's undefined.

What I -think- they want to do is examine the current
behaviour of the traffic, anticipate how it is going
to behave next, set the QoS to match that expectation,
and then "learn" both from what actually happens and
from the quality of traffic flow produced.

A self-adjusting QoS is a tough problem, and I'm not
aware of anyone who has done much research into such
things. One of the problems is that traffic flow is
random, rather than periodic or constant. There is no
obvious way, at the start, to tell if a given transfer
is going to be large or small. Also, you can't just
pick a certain set of variables to change, because the
values are highly interdependent. You've got to change
them all, and that makes the problem much more
complex.

A much better approach would be to look at QoS over
the network, rather than at a single point. This is
because optimising a single point can make some
subsequent point perform worse. What you want is to
optimise the system in totality.

On a relatively small network, this is easy. Just
have all the routers periodically transmit
their current settings, the statistics per interface
per traffic class (you don't care about the source for
this), the router load and the estimated latency &
packet loss. This data goes to a central server, which
determines the settings likely to work best in the
future.
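
To make that concrete, here's a rough sketch (in
Python) of what each router might periodically send.
Every field name here is invented for illustration;
in practice you'd populate it from "tc -s class show"
and the interface counters.

import json, socket, time
from dataclasses import dataclass, asdict

@dataclass
class ClassStats:
    class_id: str          # e.g. "1:10"
    bytes_sent: int
    packets_dropped: int
    current_rate_bps: int

@dataclass
class RouterReport:
    router_id: str
    timestamp: float
    load_average: float
    est_latency_ms: float
    est_packet_loss: float
    classes: list          # list of ClassStats dicts

def send_report(report, server, port=9099):
    # Ship one report to the central manager as a JSON datagram.
    payload = json.dumps(asdict(report)).encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(payload, (server, port))

send_report(RouterReport("rtr-1", time.time(), 0.42, 11.5, 0.002,
                         [asdict(ClassStats("1:10", 10_000_000, 12,
                                            2_000_000))]),
            "127.0.0.1")   # manager address, placeholder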

Because we're dealing with relatively large pools of
aggregated random data, we can apply statistical
techniques. I'd start off by looking at queueing
theory, which deals with how a linear series of
random arrivals builds up and gets processed. This
should be able to tell you how large a bucket you
want for each class (ie: the hard limit).
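
For a single class, elementary M/M/1 results give a
first guess at the bucket depth. This assumes Poisson
arrivals, which real traffic only approximates in
aggregate, so treat the answer as a starting point
rather than gospel:

import math

def mm1_queue_depth(arrival_rate, service_rate, overflow_prob=1e-3):
    # Smallest depth k such that P(more than k packets queued)
    # falls below overflow_prob. For M/M/1, P(N > k) = rho ** (k + 1).
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        raise ValueError("class overloaded; no finite buffer suffices")
    return math.ceil(math.log(overflow_prob) / math.log(rho)) - 1

# 800 pkt/s offered into a class drained at 1000 pkt/s:
print(mm1_queue_depth(800, 1000))   # -> 30 packets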

Calculating the optimal number of classes is harder,
but you do know that the sum of the upper soft limits
for all classes must be equal to or less than the
capacity of the router. To me, that suggests you might
be able to get a good guess for the soft limits via
the simplex method (a standard tool of operational
research).
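
As a sketch of how that linear programme might look,
using scipy.optimize.linprog (the weights and measured
demand floors below are made-up numbers; the only
constraint taken from the argument above is that the
soft limits must sum to no more than the router's
capacity):

from scipy.optimize import linprog

capacity_kbps = 10_000
weights = [5.0, 3.0, 1.0]        # hypothetical per-class priorities
min_demand = [1000, 500, 200]    # hypothetical measured floors, kbps

# linprog minimises, so negate the weights to maximise
# the weighted sum of soft limits.
res = linprog(c=[-w for w in weights],
              A_ub=[[1, 1, 1]], b_ub=[capacity_kbps],
              bounds=[(lo, capacity_kbps) for lo in min_demand],
              method="highs")
print(res.x)    # soft limit per class, in kbps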

Once you know the soft and hard limits, you can
determine the number of classes by queueing theory -
it is the minimum number of "queues" into which you
need to split the traffic to get maximum throughput,
avoiding "empty" queues.
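
One crude way to apply the "no empty queues" rule:
start from the finest candidate split and merge any
class whose offered load is too small to keep its
queue busy into a catch-all class. The 5% threshold
here is my assumption, nothing more:

def prune_classes(rates_bps, capacity_bps, min_utilisation=0.05):
    kept, leftover = {}, 0.0
    for name, rate in rates_bps.items():
        if rate / capacity_bps >= min_utilisation:
            kept[name] = rate      # busy enough for its own queue
        else:
            leftover += rate       # would sit empty most of the time
    if leftover:
        kept["default"] = kept.get("default", 0.0) + leftover
    return kept

print(prune_classes({"voip": 9e5, "web": 4e6, "ntp": 1e4}, 1e7))
# -> {'voip': 900000.0, 'web': 4000000.0, 'default': 10000.0}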

This approach would not work for a single router. The
traffic is "random", but it is not random enough, and
statistics doesn't work well on single points. It will
also fail on very large networks, because the overhead
of transmitting the metadata would become too large,
and by the time the data was processed, the results
would no longer have much meaning.

For very large networks, you could "escape" the
problem by regarding it as a large collection of
overlapping medium-sized networks. You could then
process each of the medium-sized networks using the
above method. Where two (or more) manager nodes
instruct a specific router, the router would take the
average of the recommended values. (If you know in
advance that one of the managers is more relevant than
another, then simply weight the average accordingly.)
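
The reconciliation itself is trivial. Assuming every
manager covers the same classes, a weighted average
per class would look something like:

def reconcile(recommendations, weights=None):
    # recommendations: one {class: limit} dict per manager node.
    weights = weights or [1.0] * len(recommendations)
    total = sum(weights)
    merged = {}
    for rec, w in zip(recommendations, weights):
        for cls, limit in rec.items():
            merged[cls] = merged.get(cls, 0.0) + w * limit
    return {cls: v / total for cls, v in merged.items()}

# Manager A trusted twice as much as manager B:
print(reconcile([{"1:10": 6000}, {"1:10": 3000}], weights=[2.0, 1.0]))
# -> {'1:10': 5000.0}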

***WARNING***

All of the above is speculative, in the sense that it
-should- work, but I don't have a large enough test
network to verify it. Nor do I know the optimum number
of routers/hosts where the numbers are statistically
meaningful, yet where the metadata doesn't interfere
with the traffic flow AND where the results can be
passed back and acted on within the timeframe for
which they are valid.

I say the above -should- work, because there are
methods for solving the various parts of the
problem-space. If you combine these methods correctly,
you'll end up with a solution to the whole problem
space.

Now comes crunch #1. Although traffic flow is random,
in aggregate, it is not necessarily random when split
into classes. Certain events (eg: backups over a
network, connecting to a DHCP server on power-up, etc)
are mostly going to occur at specific times. You could
always complicate the manager nodes by adding a diary
of known large-scale events, so that it can statically
allocate the correct bandwidth for those and then
dynamically allocate whatever is left.
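
The diary itself could be as dumb as a table of
(class, start, end, reservation) rows. The 2am backup
below is an invented example:

from datetime import datetime, time as dtime

DIARY = [
    # (class, start, end, reserved kbps)
    ("backup", dtime(2, 0), dtime(4, 0), 5000),
]

def available_for_dynamic(capacity_kbps, now):
    reserved = sum(kbps for _, start, end, kbps in DIARY
                   if start <= now.time() < end)
    return max(capacity_kbps - reserved, 0)

print(available_for_dynamic(10_000, datetime(2005, 3, 1, 2, 30)))
# -> 5000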

Crunch #2. Statistical methods, heuristics, etc., are
generally slow. Changes in network behaviour can be
fast. To be meaningful, the results have to be
calculated and passed back to the routers so they can
update their QoS methods before the traffic has
changed significantly.

Crunch #3. Probably the biggest problem of all.
Transmitting the metadata and then getting the updated
QoS information is going to take up bandwidth. This is
going to alter the flow. If you're lucky, the change
will be short-lived. If you're unlucky, the knock-on
effects (eg: resent packets, changes in
load-balancing, etc) will disturb the pattern
significantly and unpredictably, making the new QoS
parameters useless.

Crunch #4. This whole system is a set of feedback
loops, which aim to produce a net negative feedback
system. Because you don't know how traffic will
actually change with time, any individual QoS loop may
become a positive feedback loop. The system, as a
whole, is therefore "meta-stable" - it's stable, but
only within certain limits. Behaviour outside of those
limits could potentially crash the whole system.
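
One standard way to bias each loop toward negative
feedback is to damp the updates: move only part of the
way toward each new recommendation, and cap the step
size. The gain and cap below are assumptions; set the
gain too high and this is exactly how a loop goes
positive and starts to oscillate:

def damped_update(current, recommended, gain=0.3, max_step=0.2):
    # Move `gain` of the way toward the recommendation, but never
    # change the limit by more than `max_step` (20%) of its
    # current value in one round.
    step = gain * (recommended - current)
    cap = max_step * current
    return current + max(-cap, min(cap, step))

limit = 4000.0
for rec in [8000, 8000, 2000]:   # a swing in recommendations
    limit = damped_update(limit, rec)
    print(round(limit))          # 4800, 5760, 4632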

Pro-active QoS is certainly possible. However, I have
absolutely no idea how you could build a pro-active
QoS that was both stable AND responsive within the
sort of timeframe a network needs.

--- Ed Wildgoose <lists@xxxxxxxxxxxxxx> wrote:

> 
> >My idea is to set up a daemon to run QoS on Linux,
> >with a twist: add some A.I. capabilities to the
> >system so that it can change the QoS "topology"
> >periodically to obtain the maximum performance.
> >
> >I first want to teach the system which parameters
> >it should vary, and so I would like all of you to
> >tell me which ones you think I should change.
> >
> 
> The parameters to vary are easy enough; after all,
> you are simply segmenting the network traffic by
> type and then throttling it to some lower proportion
> of the total network capacity.
> 
> Your problem is determining the "fitness" function
> that you are optimising. After all, if you can
> describe the fitness function in enough detail, then
> you can simply implement an optimal traffic control
> function for that desired policy... In other words,
> I'm not sure where the AI bit would fit in.
> 
> Good luck
> 
> Ed W



		
