A little postscript to this discussion: I wrote a few small test
programs in C to evaluate the performance of reading integers from
a text file using <stdio.h> versus doing the same with direct
read()s from a binary file. The difference is between two and three
orders of magnitude. See http://ablog.apress.com/?p=1146
Iljitsch,
in your original question to the list, you didn't quite make clear
that you were asking about BGP-style transfer of large-scale
routing information.
Right now, you seem to focus on decoding performance. How much of
the CPU time spent on BGP goes into decoding?
Does the CPU time spent on BGP as a whole even matter*? If
yes, can a good data structure/encoding help with the *overall* problem?
The results from your test programs are not at all surprising.
Of course, a hand-coded loop where all data already is in the right
form (data type, byte order, number of bits), no decisions need to be
made, and you even know the number of data items beforehand, is going
to be faster than calling the generic, pretty much neglected,
parameterized, tired library routine fscanf that doesn't get much use
outside textbooks.
(The "read" anomaly is caused by read(2) being an expensive system
call; all other cases use a form of buffering to reduce the number of
system calls.)
What your test programs show nicely is that performance issues are
non-trivial, and, yes, you do want to run measurements, but at the system
level and not at the level of "test cases" that have little or no
relationship to the performance of the real system.
If you really care about the performance of text-based protocols, you
cannot ignore modern tools like Ragel.
If, having used them, you still manage to find the text processing
overhead in your profiling data, I'd like to hear from you.
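To give a rough idea of the kind of code in question: Ragel generates
state-machine code from a declarative description. The following is a
hand-written sketch in that style, not Ragel output; sum_decimal is a
hypothetical helper that scans an in-memory buffer of whitespace-separated
unsigned decimal integers in a single pass, without any library calls.

```c
#include <stddef.h>
#include <stdint.h>

/* Two-state scanner over buf[0..len): sums unsigned decimal integers
 * separated by spaces, tabs, or line ends.
 * Returns 0 on success, -1 on any other character.
 * Overflow of the 64-bit accumulators is not checked in this sketch. */
static int sum_decimal(const char *buf, size_t len, uint64_t *sum)
{
    enum { BETWEEN, IN_NUMBER } state = BETWEEN;
    uint64_t value = 0, total = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];

        if (c >= '0' && c <= '9') {
            value = value * 10 + (uint64_t)(c - '0');
            state = IN_NUMBER;
        } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            if (state == IN_NUMBER) {   /* a number just ended */
                total += value;
                value = 0;
                state = BETWEEN;
            }
        } else {
            return -1;                  /* unexpected character */
        }
    }
    if (state == IN_NUMBER)             /* input ended inside a number */
        total += value;
    *sum = total;
    return 0;
}
```

The loop touches each input byte once and makes no per-item library or
system calls, which is the property that matters when comparing it
against fscanf.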
Still, for BGP, a binary protocol encoding may be a better fit
because routing tables are so much about bits and prefixes and other
numeric information already designed to be used in binary protocol
encodings.
Also, it may be easier to reduce both data rate and processing by
exploiting more of the structure of the BGP routing information.
(I.e., to make it redundantly clear, I would probably choose binary
here, but not for the reasons given in your blog post.)
Regards, Carsten
*) Yes, that's a trick question to elicit responses :-)