RE: [LARTC] Intelligent P2P detection

"Luman" <llu@xxxxx> · Thu, 27 Mar 2003 10:24:08 +0100

>Assumptions:
>  Determine and mark 'good traffic' -- i.e. smtp, ftp, http, ssh, etc.,
>  everything which uses well known ports.  Probably most people do it
>  anyway, at least to some level.

The problem is that P2P software now often uses these well-known ports
too, so the assumption that port 80 carries only HTTP no longer holds.

My intention in raising this problem is to open a wider discussion
aimed at finding, or if need be building, a "tool" (it might be a kernel
patch, or whatever) that can thoroughly analyze traffic and its content
and, on that basis, decide (likely with less than 100% certainty) what
the content is. For instance, it could detect that traffic is HTTP even
if it is sent to port 46723, based on the content of the data. Such a
tool should be built on a modular architecture that allows adding new
testers, or new knowledge, for guessing the protocol. Obviously, it
should track connections, sessions and everything else that can be used
for traffic classification.
As a result, packets would be marked with a standardized number
identifying the type of protocol, for instance HTTP, KaZaa, MSN, SSH,
SCP etc. This knowledge could be used for traffic shaping and whatever
else. Can you imagine how comfortable an administrator's work would be
if the border router or firewall configuration could work with such
well-determined content? The number of rules would be reduced
dramatically. Obviously, the classification knowledge would grow day by
day.

The whole idea is very similar to the Unix 'file' command. For instance,
I had a file "a.gz" on my system. The type of this file is obvious: it
is gzip. However, I renamed it to "a.txt". The name now suggests a text
file, but when I run file a.txt I get the following answer:
~# file a.txt 
a.txt: gzip compressed data, deflated, original filename,
`ucspi-tcp-0.88.tar', last modified: Sat Mar 18 16:21:39 2000, max
compression, os: Unix.
This program doesn't care about extensions; it tries to guess the type
by analyzing the content. Of course, it often gives a wrong answer, but
that comes down to gaps in its knowledge.
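
The principle behind this can be sketched in a few lines of Python. This
is not how 'file' itself is implemented (it uses a much richer magic
database); the table and the function name here are my own assumptions,
kept to three well-known signatures.

```python
# Hypothetical sketch of content sniffing by leading "magic" bytes.
# The table is a tiny illustrative sample, not the real magic database.
MAGIC = [
    (b"\x1f\x8b", "gzip compressed data"),
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image data"),
]


def guess_type(data: bytes) -> str:
    """Guess a type from the content alone; the file name is never used."""
    for magic, description in MAGIC:
        if data.startswith(magic):
            return description
    return "data"  # nothing matched: the knowledge base has a gap
```

A gzip archive renamed to "a.txt" still begins with the bytes 0x1f 0x8b,
so guess_type() reports it as gzip regardless of the name.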

To summarize my rather long mail: I think our present methods are like
determining the content of a file based only on the extension of its
name. I believe we badly need something more.

>
>  All that is left are P2P connections and some other misc connections.
>  A bit unfair for other protocols using non-standard ports, like
>  Instant Messenger style-software, and lots of other stuff.  So here
>  we introduce a trick.  IMs and other low bandwidth traffic will use
>  small packets ( < 512 or even < 256), P2P will use the maximum MTU
>  available (usually 1500, but I've seen some using 576 packets, hence
>  I treat > 512 as P2P).
>
>  Probably you've noticed that I mention round numbers, such as 512 or
>  1024, that's because I use u32 for marking packets.  How I do it, we
>  leave as an exercise to the reader. ;-)))

I like your solution very much. I'll try to apply it to my system as a
temporary solution.
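
My guess at why the round numbers matter: u32 compares a masked value
with a pattern, and "length < 512" becomes a single masked comparison
only when the limit is a power of two. Below is a small Python sketch of
that arithmetic. The function names are mine, and this is my reading of
the exercise, not the original poster's actual rules. In tc terms I
would expect it to correspond to something like
"match u16 0x0000 0xfe00 at 2", since the IP total length is the 16-bit
field at offset 2 of the header.

```python
# Hypothetical sketch of the mask arithmetic behind a u32-style
# "packet is smaller than a power-of-two limit" test.
import struct


def total_length(ip_header: bytes) -> int:
    """Read the 16-bit total length field at offset 2 of an IPv4 header."""
    return struct.unpack_from("!H", ip_header, 2)[0]


def size_mask(limit: int) -> int:
    """Mask that selects every length bit worth `limit` or more."""
    if limit <= 0 or limit & (limit - 1):
        raise ValueError("limit must be a power of two")
    return 0xFFFF & ~(limit - 1)  # 512 gives 0xFE00, 256 gives 0xFF00


def is_small(ip_header: bytes, limit: int = 512) -> bool:
    """True when (length & mask) == 0, which means length < limit."""
    return (total_length(ip_header) & size_mask(limit)) == 0
```

With limit 512 the mask is 0xfe00, so a 511-byte packet matches as small
and a 576-byte or 1500-byte packet does not.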

Thank you for your voice in this discussion.

Best regards,
Luman

Linux Advanced Routing and Traffic Control