On mar, sep 04, 2007 at 09:38:57 +0000, Alex Riesen wrote: > Lukas Sandström, Tue, Sep 04, 2007 22:50:08 +0200: > > Hi. > > > > This is an attempt to use "The Better String Library"[1] in builtin-mailinfo.c > > > > The patch doesn't pass all the tests in the testsuit yet, but I thought I'd > > send it out so people can decide if they like how the code looks. > > It looks uglier, but what are measurable merits? Object code size, > perfomance hit/improvement, valgrind logs? Well I honestly believe that putting strbufs/bstrings in mailinfo.c adds no value. I was going to give it a try to see how strbufs performed, but it's just useless. The main problem mailinfo has, it's according to Junio that it may sometimes truncate some things in buffers at 1000 octets, without dying loudly. That is bad. _but_ there is no point in using arbitrary long string buffers to parse a mail. Remember, a mail goes through SMTP, and SMTP is supposed to limit its lines at 512 characters (without use of extensions at least). Not to mention that an email address cannot be more than 64+256 chars long (or sth around that). So using variable lengths buffers is just a waste. string buffers are not really (IMHO) supposed to help in parsing tasks, and when you need to do some serious parsing, either do it by hand or use lex, but nothing in between makes sense to me. OTOH, string buffers can be used in many places where git has (at least 4 different to my current count, growing) many implementations of always slightly different kind of buffers. I've some more patches pending here than the one I already sent, and well, here is the diffstat: $ git diff --stat origin/master.. ^strbuf* archive-tar.c | 67 ++++++++++++------------------------------------ builtin-apply.c | 29 ++++++--------------- builtin-blame.c | 34 ++++++++----------------- builtin-commit-tree.c | 59 +++++++++--------------------------------- builtin-rerere.c | 53 +++++++++++--------------------------- cache-tree.c | 57 ++++++++++++++--------------------------- diff.c | 25 ++++++------------ fast-import.c | 38 +++++++++++---------------- mktree.c | 26 ++++++------------- 9 files changed, 116 insertions(+), 272 deletions(-) I mean, there is not even a need to show the diff to understand what the gain is. And that was possible, because strbufs are straightforward, and gives you the kind of controls git needs (tweaking how memory will be allocated to avoid reallocs is part of the answer). A French author once said: “Il semble que la perfection soit atteinte non quand il n'y a plus rien à ajouter, mais quand il n'y a plus rien à retrancher.” -- Antoine de St Éxupéry[0]. IMHO git will never need any of the bstring splits, streaming functions, tokenization or whatever, and supporting those has necessarily led the bstring library to make some choices that may not fit git needs. I don't really like reinventing the wheel, but OTOH buffers and strings are often of the critical path, and having a nice fitting buffer API is priceless. [0] Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away. -- ·O· Pierre Habouzit ··O madcoder@xxxxxxxxxx OOO http://www.madism.org
Attachment:
pgpviAFA9pQCZ.pgp
Description: PGP signature