Re: [dispatch] SIP-CLF: Results on ASCII vs. binary representation

"Vijay K. Gurbani" <vkg@xxxxxxxxxxxxxxxxxx> · Wed, 29 Apr 2009 13:12:28 -0500

Theo Zourzouvillys wrote:
> actually, your test program is *grossly* skewed in favour of the ASCII
> implementation.  If you modify it slightly to behave in a way i'd
> expect any developer to, you get (avg 5 runs on a crappy dell vostro
> desktop):
>
>  Binary CLF:   0m6.947s
>  ASCII CLF:    0m7.004s
> [...]

Theo: True, using scatter-gather (s-g) writes you can optimize the
binary I/O.

By the same token, I can optimize the ASCII writes a bit using
s-g writes; for instance, I was able to bring the average down
by 1.02s for ASCII CLF using s-g writes.

But we intentionally stayed away from s-g writes for the
following four reasons:

1) On some systems, the value of IOV_MAX is set to a low number.
  For example, in Solaris 8 the value of IOV_MAX is set to 16,
  forcing you to do multiple s-g I/O calls (thereby negating
  some of the optimization effects.)

2) Some systems have a maximum ceiling on how many bytes can
  be transferred in one writev(), i.e., the sum of all iov_len
  members of the iov array should be less than a certain
  system-defined maximum.

    [Note: I seem to remember that a few years ago, the
    Apache lists were full of problems related to 1 and 2.]

3) Portability: I was not sure how portable the writev() system
  call would be on all kinds of operating environments.  Since
  the SIP CLF is designed for all SIP entities, if a (real-time)
  operating system does not have the writev() system call, one
  is forced to use the non-optimized method anyway.  Furthermore,
  in a RT OS, the limits for 1 and 2 will be much lower, if
  writev() is provided at all.

4) We did not necessarily want to make any assumptions about
  how implementations have created data structures to hold
  the results of parsing (i.e., some implementations may very
  well use struct's to store the text and length for each
  SIP token, while others may simply store the text and
  compute the length when needed, etc.)  For the sake of
  demonstration, our program implements the first option
  (i.e., uses struct's to store the text and length), which
  is actually more conducive to the s-g approach, but by no
  means is this the only way to design your data structures.
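To make reason 1 concrete, here is a minimal sketch (not code from our
test program) of what a caller has to do when the per-call iovec limit
is small.  The names writev_chunked and max_per_call are made up for
illustration; on a real system one would pass IOV_MAX or the result of
sysconf(_SC_IOV_MAX).  It does not address the byte ceiling in reason 2.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all iovcnt entries, passing at most
 * max_per_call entries to each writev() call.  The iov array is
 * modified in place when a write is partial.
 * Returns 0 on success, -1 on error (errno is set by writev()).
 */
static int writev_chunked(int fd, struct iovec *iov, int iovcnt,
                          int max_per_call)
{
    while (iovcnt > 0) {
        int n = iovcnt < max_per_call ? iovcnt : max_per_call;
        ssize_t written = writev(fd, iov, n);

        if (written < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: retry the same chunk */
            return -1;
        }

        /* Skip over the entries that were written completely. */
        while (iovcnt > 0 && (size_t)written >= iov->iov_len) {
            written -= (ssize_t)iov->iov_len;
            iov++;
            iovcnt--;
        }

        /* Advance into an entry that was only partially written. */
        if (iovcnt > 0 && written > 0) {
            iov->iov_base = (char *)iov->iov_base + written;
            iov->iov_len -= (size_t)written;
        }
    }
    return 0;
}
```

With a limit of 16, a record of 40 entries costs at least three
writev() calls instead of one, which is where part of the s-g gain
goes.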

(1) is a real concern because, as you can well imagine, URIs,
once parsed, can be composed of many different objects (or
structs in C).  As such, the representation of a composed URI
in an iov array will require multiple entries.
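As a rough illustration of that point, the sketch below maps parsed
tokens onto iovec entries, one entry per token.  The struct sip_token
layout and the tokens_to_iov name are assumptions for the example, not
the data structures of any particular implementation.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical parsed token: pointer into the message plus its length. */
struct sip_token {
    const char *text;
    size_t      len;
};

/*
 * Fill iov[] from tok[], one entry per token.
 * Returns the number of entries used, or -1 if the tokens do not fit
 * into max_iov entries.
 */
static int tokens_to_iov(const struct sip_token *tok, int ntok,
                         struct iovec *iov, int max_iov)
{
    if (ntok > max_iov)
        return -1;

    for (int i = 0; i < ntok; i++) {
        /* The cast drops const; writev() only reads from iov_base. */
        iov[i].iov_base = (void *)tok[i].text;
        iov[i].iov_len  = tok[i].len;
    }
    return ntok;
}
```

If a URI such as sip:alice@example.com:5060 were stored as separate
tokens for the scheme, user, host, port and their delimiters, it would
take seven entries on its own under this layout, so a limit of 16 is
reached quickly.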

Hence, we wanted to use the lowest common denominator to do the
measurements -- in our initial performance data, there is no
optimization for either the binary CLF case or the ASCII CLF case.

> note that i wrote it in all of about 120 seconds, so there may be some
> errors in the output format, but my point stands :-)

There appear to be some, since I cannot read the last record; but I
have not had the chance to look at the output format from your
program in any detail.

Thanks,

- vijay
--
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
Email: vkg@{alcatel-lucent.com,bell-labs.com,acm.org}
Web:   http://ect.bell-labs.com/who/vkg/
_______________________________________________
Sipping mailing list  https://www.ietf.org/mailman/listinfo/sipping