Re: packet priorities

| I'd like to write a patch to kernel that would allow dccp packets to be sent 
| according to priorities. There are a few things that might be worth 
| discussing.
Excellent - you have actually hit on a major problem which is still
unresolved in the API.

This has wider scope and is thus important to resolve. You didn't say
what you want to use priorities for, but the idea of a prioritisation
scheme is very good.

The problem is that the socket API is "weird": a TCP/UDP socket would
simply block until it can send a packet. DCCP may block because it is
doing congestion control. Currently the difference from the normal
sockets API is that Linux DCCP uses a kind of "port" (in operating-system
terms): the application can fill this port with data until it is told
"EAGAIN" (port busy).
This is insufficient for real-time data (which may become too old), and
I am guessing that this is where your prioritisation ideas come in.
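
To make the "fill the port until EAGAIN" behaviour concrete, here is a
minimal sketch. It is not DCCP-specific: it assumes any datagram socket
with a bounded queue (for testing, an AF_UNIX socketpair can stand in for
a DCCP socket), and the helper name fill_until_eagain() is made up for
this illustration.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: push fixed-size packets into fd without blocking
 * until the socket reports EAGAIN ("port busy").
 * Returns the number of packets accepted, or -1 on any other error.
 * max_pkts is only a safety cap for the loop.
 */
static long fill_until_eagain(int fd, size_t pkt_len, long max_pkts)
{
    char buf[256];
    long sent = 0;

    if (pkt_len > sizeof(buf))
        pkt_len = sizeof(buf);
    memset(buf, 0, sizeof(buf));

    while (sent < max_pkts) {
        ssize_t n = send(fd, buf, pkt_len, MSG_DONTWAIT);

        if (n >= 0) {
            sent++;
            continue;
        }
        if (errno == EINTR)
            continue;               /* interrupted, simply retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return sent;            /* queue is full: "port busy" */
        return -1;                  /* unrelated failure */
    }
    return sent;
}
```

Everything accepted before EAGAIN sits in the queue in arrival order; the
application has no say in what is sent first or what is dropped once the
data has become too old.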

The only existing approaches I know of are
 1. Ian's patches which communicate an expiry time to the kernel
    http://www.wand.net.nz/~iam4/
    Ian keeps his best-packet-next algorithm as an experimental patch set,
    but I can see useful points - in particular the idea of passing the
    expiry time as ancillary data (cmsghdr).
 2. There was an early implementation by Lai/Kohler
    http://www.cs.ucla.edu/~kohler/pubs/lai04efficiency.pdf
    but this is more of a conceptual model, as it shares memory regions
    between kernel and user space. The only way I can see of
    implementing this would be mmap() with additional primitives to
    protect the shared areas. Maybe there is a smarter way.
    This used a 2-priority scheme: enqueued packets are either `live'
    or `dead'; and the application can modify packets it already 
    enqueued.
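
As an illustration of the ancillary-data idea in (1), the sketch below
packs an expiry time into a cmsghdr and reads it back. This is not Ian's
actual interface: the cmsg type HYP_DCCP_SCM_EXPIRY, the millisecond unit
and both helper names are assumptions made up for this example. Only the
CMSG_* macros and struct msghdr are the standard socket API.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_DCCP
#define SOL_DCCP 269                 /* Linux value, if libc lacks it */
#endif
#define HYP_DCCP_SCM_EXPIRY 1        /* hypothetical cmsg type */

/* Attach an expiry time (assumed unit: ms) as ancillary data to msg.
 * ctrl must be suitably aligned for struct cmsghdr. */
static int attach_expiry(struct msghdr *msg, void *ctrl, size_t ctrl_len,
                         uint32_t expiry_ms)
{
    struct cmsghdr *cmsg;

    if (ctrl_len < CMSG_SPACE(sizeof(expiry_ms)))
        return -1;
    memset(ctrl, 0, ctrl_len);
    msg->msg_control = ctrl;
    msg->msg_controllen = CMSG_SPACE(sizeof(expiry_ms));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = SOL_DCCP;
    cmsg->cmsg_type = HYP_DCCP_SCM_EXPIRY;
    cmsg->cmsg_len = CMSG_LEN(sizeof(expiry_ms));
    memcpy(CMSG_DATA(cmsg), &expiry_ms, sizeof(expiry_ms));
    return 0;
}

/* What the receiving side of the API would do: walk the control
 * messages and extract the expiry time, if one is present. */
static int read_expiry(struct msghdr *msg, uint32_t *expiry_ms)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_DCCP &&
            cmsg->cmsg_type == HYP_DCCP_SCM_EXPIRY) {
            memcpy(expiry_ms, CMSG_DATA(cmsg), sizeof(*expiry_ms));
            return 0;
        }
    }
    return -1;
}
```

An application would fill in msg_iov with the payload as usual and pass
the msghdr to sendmsg(); the per-packet metadata then travels with the
packet without changing the send()/write() paths.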

My feeling is that (2), while interesting in principle and worth
exploring, is more complex to implement (it needs the mmap() call).

Therefore I think that your idea and Ian's approach are more feasible.
It may take some iterations to make the API fully usable, but it is time
to start this. I agree with many of your points, comments below.
    

| 1. The patch should not change default kernel behaviour. That is prioritizing 
| should be turned on explicitly not to break existing applications.
I'd even say it could risk breaking existing applications, since the API
is not a particularly good one at the moment.

| 2. The mechanism should be CCID independent.
Yes.
| 3. For now I plan to add only priorities. But I can imagine that other 
| criteria might be useful (for example expiry times as proposed by Ian's 
| experimental patch). This makes it necessary to think of a way to specify 
| queuing and dequeuing method. Should it be set per socket or per packet?
For the queuing method, changing the policy on a per-packet basis means
a lot of overhead, so a per-socket policy seems reasonable.

| 4. How fast should it be in terms of computational complexity? Is O(n) 
| acceptable, where n is the number of packets in queue? Or should I make it 
| O(m), where m is number of priorities currently in queue? Or should I 
| think of something faster?
This is a good thought. For me, the question of "what is communicated
and how" is almost as important.
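
On the complexity question, one possible layout (a sketch only, not a
proposal for the actual kernel data structure) is one FIFO per priority
band. Enqueue is then O(1) and dequeue is bounded by the number of bands,
independent of the number of queued packets. NUM_BANDS, struct pkt and
the bq_* names are assumptions for this example; a higher band number is
assumed to mean higher priority.

```c
#include <stddef.h>

#define NUM_BANDS 8                  /* assumed number of priority bands */

struct pkt {
    int id;
    struct pkt *next;
};

/* One singly linked FIFO per band. */
struct band_queue {
    struct pkt *head[NUM_BANDS];
    struct pkt *tail[NUM_BANDS];
};

static void bq_init(struct band_queue *q)
{
    for (int b = 0; b < NUM_BANDS; b++)
        q->head[b] = q->tail[b] = NULL;
}

/* O(1): append the packet to the FIFO of its band. */
static int bq_enqueue(struct band_queue *q, struct pkt *p, unsigned band)
{
    if (band >= NUM_BANDS)
        return -1;
    p->next = NULL;
    if (q->tail[band])
        q->tail[band]->next = p;
    else
        q->head[band] = p;
    q->tail[band] = p;
    return 0;
}

/* O(NUM_BANDS): take the oldest packet of the highest non-empty band. */
static struct pkt *bq_dequeue(struct band_queue *q)
{
    for (int b = NUM_BANDS - 1; b >= 0; b--) {
        struct pkt *p = q->head[b];

        if (!p)
            continue;
        q->head[b] = p->next;
        if (!q->head[b])
            q->tail[b] = NULL;
        p->next = NULL;
        return p;
    }
    return NULL;                     /* all bands empty */
}
```

Packets of equal priority keep their arrival order, which matches what an
application gets from the current queue when it does not use priorities.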

| 5. Should the number of packet priorities be hard limited? I can't imagine 
| using more than 8 bands, so maybe limiting to about 16 different priorities 
| would be ok?
It would be great if the design allowed different types of policies,
e.g. "earliest-packet first". The limit on the number of priorities can
also be configured via a Kconfig option, so it is not a big deal.
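
To sketch what "different types of policies" selected once per socket
could look like: a small table of comparison functions indexed by a
policy id. All names here (struct tx_pkt, enum qpolicy_id,
qpolicy_pick()) are hypothetical, the linear scan is only for brevity,
and a larger prio value is assumed to mean "more important".

```c
#include <stddef.h>
#include <stdint.h>

struct tx_pkt {
    uint32_t prio;                   /* larger = more important (assumed) */
    uint64_t expiry_ms;              /* deadline, arbitrary time base */
};

enum qpolicy_id { QP_PRIO, QP_EARLIEST_EXPIRY, QP_COUNT };

/* Returns nonzero if a should be sent before b. */
typedef int (*qp_before_fn)(const struct tx_pkt *a, const struct tx_pkt *b);

static int prio_before(const struct tx_pkt *a, const struct tx_pkt *b)
{
    return a->prio > b->prio;
}

static int expiry_before(const struct tx_pkt *a, const struct tx_pkt *b)
{
    return a->expiry_ms < b->expiry_ms;
}

/* One entry per policy; the socket would store only the policy id. */
static const qp_before_fn qpolicies[QP_COUNT] = {
    [QP_PRIO]            = prio_before,
    [QP_EARLIEST_EXPIRY] = expiry_before,
};

/* Index of the packet to send next under the given policy, -1 if none.
 * Strict comparison keeps arrival order among equal packets. */
static long qpolicy_pick(enum qpolicy_id id, const struct tx_pkt *q, size_t n)
{
    size_t best = 0;

    if ((unsigned)id >= QP_COUNT || n == 0)
        return -1;
    for (size_t i = 1; i < n; i++)
        if (qpolicies[id](&q[i], &q[best]))
            best = i;
    return (long)best;
}
```

Adding a further policy then means adding one enum value and one
comparison function, while the rest of the send path stays unchanged.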

| 6. Packets with lowest priorities should be discarded so as not to exceed 
| configured queue length.
I am interested in making this more precise, since this is exactly the
problem that applications currently run into:
 * media servers which need to serve a streaming packet before a given 
   deadline
 * traffic generators such as D-ITG which likewise need to "get a packet
   out" within a given time bound (they have pre-computed inter-packet
   gaps which are determined as random variables).
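
One possible reading of point 6, as a starting point for making it more
precise: when the queue is at its configured length, the oldest packet of
the lowest priority is evicted, unless the new packet is itself no more
important, in which case the new packet is dropped. TXQ_LEN, struct txq
and txq_push() are hypothetical names; packet ids are assumed to be
non-negative so that -1 can mean "nothing dropped".

```c
#include <stddef.h>
#include <string.h>

#define TXQ_LEN 4                    /* assumed configured queue length */

struct qpkt {
    int id;                          /* non-negative packet identifier */
    unsigned prio;                   /* larger = more important (assumed) */
};

struct txq {
    struct qpkt slot[TXQ_LEN];       /* kept in arrival order */
    size_t count;
};

/* Enqueue p. Returns the id of the packet that was dropped to make
 * room (possibly p itself), or -1 if nothing had to be dropped. */
static int txq_push(struct txq *q, struct qpkt p)
{
    size_t victim = 0;
    int dropped;

    if (q->count < TXQ_LEN) {
        q->slot[q->count++] = p;
        return -1;
    }

    /* Find the oldest packet among those with the lowest priority. */
    for (size_t i = 1; i < q->count; i++)
        if (q->slot[i].prio < q->slot[victim].prio)
            victim = i;

    if (p.prio <= q->slot[victim].prio)
        return p.id;                 /* newcomer is the least important */

    /* Evict the victim, close the gap, append the newcomer. */
    dropped = q->slot[victim].id;
    memmove(&q->slot[victim], &q->slot[victim + 1],
            (q->count - victim - 1) * sizeof(q->slot[0]));
    q->slot[q->count - 1] = p;
    return dropped;
}
```

For the deadline-driven applications listed above, the eviction test
would more likely be "already past its deadline" than "lowest priority";
the structure of the function stays the same.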


| Would such a patch be accepted in mainline kernel? Of course after discussing 
| the ideas and implementation details. Thanks in advance for your input,
This depends on Arnaldo's decision. From experience, experimental or new
features take a little longer, but this should by no means be a discouragement.

In the meantime, I would be more than happy to allocate space and/or a
tree as part of the test tree,
http://www.linux-foundation.org/en/Net:DCCP_Testing#Experimental_DCCP_source_tree

which would be kept in sync with the netdev tree.


Thanks for the input
Gerrit