Re: Max number of calls

Hi Craig,

thanks for your explanation.

Three points I'd like to raise, though.
Beware, all of the following is off the top of my head, so I may be completely mistaken here
and my memory may not serve me correctly on the details.
Also, I am no programmer, so I don't really know what I'm talking about :)
You're the expert.
Anyway.


First, the number of sockets to listen on is pretty small.
I can run the same gk in H.225/H.245 routed mode with thousands of connections,
but in RTP proxy mode with only a little more than 100.
I didn't check the code, so I'm not sure about the exact number of sockets gnugk uses,
but it should be something like 4 for a call that is H.225/H.245 routed and 8 for a call
that is RTP proxied.
I'm unwilling to believe that this rather small increase in sockets causes that much of an
increase in CPU usage for select().
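
Just to make that concrete, here is roughly the kind of select() loop I imagine a proxy
thread runs -- purely my own sketch, not the actual gnugk code, and all the names are
made up:

/* Sketch of a select()-based proxy loop -- my guess at the general shape,
 * not gnugk source.  'rtp_socks' stands for the UDP descriptors of all
 * currently proxied calls (roughly 8 per call, if I'm right above). */
#include <sys/select.h>
#include <sys/time.h>
#include <vector>

void proxy_loop(const std::vector<int> &rtp_socks)
{
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);                      // the set is rebuilt on every pass
        int maxfd = -1;
        for (size_t i = 0; i < rtp_socks.size(); ++i) {
            FD_SET(rtp_socks[i], &readfds);
            if (rtp_socks[i] > maxfd)
                maxfd = rtp_socks[i];
        }
        struct timeval tv = { 1, 0 };           // 1 second timeout
        int n = select(maxfd + 1, &readfds, NULL, NULL, &tv);
        if (n <= 0)
            continue;                           // timeout or interrupted by a signal
        for (size_t i = 0; i < rtp_socks.size(); ++i) {
            if (FD_ISSET(rtp_socks[i], &readfds)) {
                // forward the waiting packet on this socket (sketch further down)
            }
        }
    }
}

Even at a little over 100 proxied calls that is only on the order of a thousand
descriptors to set and scan per pass.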



Second, I encountered the same c10k discussion when I tried to tune a Squid HTTP
proxy some (long) time ago.
It is more or less the same problem there: thousands of connections need to be proxied.
Squid uses select(), too, and the ratio of connection count to CPU power needed per
connection is even more unfavourable for HTTP than for H.323/RTP, so the select() issue
should have an even higher impact in that case.


I vaguely recall that the outcome was that "nowadays" select() is no longer a bottleneck,
because many operating systems implement it internally in a different way, similar to poll().
That was at least true for Solaris. I'm not sure about Linux, though.
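
For comparison, the poll() interface that discussion kept pointing at looks roughly like
this -- again just an illustration of the API, not a claim about what gnugk does or could
easily be changed to do:

/* The same loop shape with poll() instead of select() -- illustration only.
 * The pollfd array can be built once and reused; there is no FD_SETSIZE
 * limit and no fd_set to clear and refill on every iteration. */
#include <poll.h>
#include <vector>

void proxy_loop_poll(const std::vector<int> &rtp_socks)
{
    std::vector<struct pollfd> pfds(rtp_socks.size());
    for (size_t i = 0; i < rtp_socks.size(); ++i) {
        pfds[i].fd = rtp_socks[i];
        pfds[i].events = POLLIN;                // we only care about readable sockets
    }
    for (;;) {
        int n = poll(&pfds[0], pfds.size(), 1000);   // 1 second timeout
        if (n <= 0)
            continue;                           // timeout or interrupted by a signal
        for (size_t i = 0; i < pfds.size(); ++i) {
            if (pfds[i].revents & POLLIN) {
                // forward the waiting packet on this socket
            }
        }
    }
}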



Third, my figures only showed how CPU usage is distributed across syscalls.
I also see, however, that close to half of the CPU is burned in user space
and the other half in system space. select() should be counted as system usage, right?


So whether or not we are hit by the c10k select() CPU burner, half of the CPU
is still being used by something else.
And since gnugk just needs to copy one RTP stream into another, the corresponding code
is pretty simple and shouldn't use much CPU, so what's probably eating CPU is the
OpenH323 lib.
I know this is half-hearted guessing and I should use a real profiler
to find out what it is really doing, but again, I'm no programmer.
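
To illustrate what I mean by "copying one RTP stream into another": per packet it should
boil down to something like the following (again my sketch, not gnugk's code, with made-up
names), which is also what the recvfrom/sendto pairs in the strace output further down
suggest:

/* Hypothetical per-packet forwarding step -- a sketch, not gnugk source.
 * 'in_fd' and 'out_fd' are the two UDP sockets of one proxied RTP leg,
 * 'peer' is the far-end address learned from the signalling. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

void forward_packet(int in_fd, int out_fd, const struct sockaddr_in *peer)
{
    char buf[2048];                             // comfortably larger than any RTP packet
    ssize_t len = recvfrom(in_fd, buf, sizeof(buf), 0, NULL, NULL);
    if (len <= 0)
        return;                                 // nothing to read, or an error
    sendto(out_fd, buf, (size_t)len, 0,
           (const struct sockaddr *)peer, sizeof(*peer));
}

Not much room in there to burn half a CPU, which is why I suspect the overhead sits
somewhere else.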



Alright. Now I'm curious how little of this stuff I actually understood. Please tell me :)


best regards,
Markus


Craig Southeren wrote:


Markus and David,

The problem is that Unix programs are just not designed to handle
hundreds of file handles in a single select statement. The "poll" system
call was designed as a better, but not perfect, solution to that
problem.

There will always be a limit on the number of the file handles a single
process can address. Expanding it from the default of 1024 to 16384
certainly means that more file handles are available, but it also means
that more CPU time is spent simply manipulating lists of FDs rather
than doing useful work (as demonstrated by Markus's figures). For a good
explanation of the tradeoffs of various approaches, see
http://www.kegel.com/c10k.html

From experience, the way to solve this problem is to use a single
gatekeeper with multiple separate proxy processes. I personally have
seen a single 1.6Ghz Linux machine handle more than 400 simultaneous calls
with no problems using this approach. That was signalling only - doing
media as well would reduce the number of calls, but then again, using a
dual 3Ghz machine would increase it again dramatically.

A single gatekeeper process can easily handle thousands of registrations,
but there is no way it can also handle the call proxying for that number
of users at any useful utilisation ratio. And for systems that need to
support tens of thousands of users, with thousands of simultaneous calls,
a distributed system of proxies and GK will always be needed.

And yes, such a system does exist :)

Craig

On Thu, 15 Apr 2004 10:15:40 +0200
Markus Storm <markus.storm@xxxxxxxxxxxxx> wrote:



david winter wrote:



I understand that part, but is it just really hard to proxy the RTP? Is there a bottleneck somewhere? Is the code just not efficient enough? (Not that I am knocking the code, this is great stuff.) But I don't see the value in proxying this (barring doing it for NATs) until I can get closer to 500 proxied RTP calls on a $3000 computer. This is coming from the perspective of an international VoIP carrier, carrying LOTS of minutes.


I agree (same business, same problem here).

Interesting, though, that most of the CPU seems to be burned in select().
I've compiled with LARGE_FDSET enabled; without that, gnugk burns as much as 100% CPU.



[root@lnxc-025:/usr/local/bin]$ strace -cp 22671
Process 22671 attached - interrupt to quit
Process 22671 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 80.41    1.921345          58     32953           select
  9.52    0.227385           9     25324           sendto
  6.70    0.160188           6     25324           recvfrom
  1.61    0.038422          93       414       414 rt_sigsuspend
  1.23    0.029284           4      7600           gettimeofday
  0.24    0.005694           7       838           kill
  0.11    0.002519           9       285           write
  0.08    0.001927           5       414       414 sigreturn
  0.07    0.001712           4       414           rt_sigprocmask
  0.02    0.000432          24        18           send
  0.01    0.000216           6        36           recv
  0.00    0.000099          33         3           shutdown
  0.00    0.000057          10         6           time
  0.00    0.000041          21         2           socket
  0.00    0.000036          12         3           close
  0.00    0.000028          14         2           listen
  0.00    0.000014           7         2           bind
  0.00    0.000009           5         2           getsockname
  0.00    0.000009           5         2           setsockopt
------ ----------- ----------- --------- --------- ----------------
100.00    2.389417                 93642       828 total










_______________________________________________________

List: Openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Archive: http://sourceforge.net/mailarchive/forum.php?forum_id=8549
Homepage: http://www.gnugk.org/
