Hi Stephan,
On 1/30/19 10:40 PM, Stephan Bergmann wrote:
> On 30/01/2019 22:17, Matteo Casalin wrote:
>> I'm working on improving code that calls getToken (e.g. using its
>> version with index, or using other OUString functions in its place
>> when possible).
>> One thing that I noticed is that there are a lot of calls in the form
>> getToken().toInt# which require memory management just to obtain a
>> value that could be generated by the original OUString. Similarly (but
>> less frequently), some tokens are extracted just to compare them
>> against a string, which again requires memory management that is
>> really not needed.
>> I was wondering if extending O(U)String with functions like:
>> * getTokenAs[U]Int#(token, sep, index)
>> * matchToken(token, sep, index, string)
>> would be accepted/appreciated or not. At the moment I already
>> submitted to gerrit a patch [1] which adds
>> comphelper::string::matchToken but I think that adding such
>> functionality to OUString directly would be nicer. Also, introducing
>> getTokenAsInt in OUString would likely allow to reuse its toInt code.
> Sounds a bit too special-purpose to be worth adding, IMO. Would those
> optimizations really make a measurable difference?
I don't have real measurements to provide, but a very rough check on
getToken gives the following counts:
git grep -w getToken > getToken.txt
grep -wc getToken getToken.txt ==> 1646
grep -wc toInt32 getToken.txt ==> 218
grep -wc toInt64 getToken.txt ==> 8
grep -wc toUInt32 getToken.txt ==> 0
grep -wc toUInt64 getToken.txt ==> 8
The number of getToken occurrences is higher than the number of real
OUString::getToken calls (it includes comments, header files, definitions
and also getToken functions that are not OUString's), and I am missing
places in which the conversion to integer is done on a following line. As
a result, this pattern accounts for at least 14.2% of all getToken
occurrences (218 + 8 + 0 + 8 = 234 out of 1646). I cannot say whether this
is frequently called code or not.
As for matchToken, this seems to be a much less frequent pattern, and at
the moment the comphelper function is a viable solution, so I would go
this way (and will take care of reviewing some older getToken
optimizations that I implemented).
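For illustration, a token comparison that avoids the temporary string can
be sketched in plain C++17 with std::string_view. This is not the
comphelper::string::matchToken from the gerrit patch: the name
matchTokenSketch, its parameter order and the use of std::string_view
instead of OUString are all assumptions made for the example.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not the comphelper signature): returns true if the
// token with index `token` in `str`, split on `sep`, equals `expected`.
// No temporary string is created; only views into `str` are compared.
bool matchTokenSketch(std::string_view str, std::size_t token, char sep,
                      std::string_view expected)
{
    std::size_t start = 0;
    // Skip `token` separators to reach the start of the requested token.
    for (; token > 0; --token)
    {
        const std::size_t pos = str.find(sep, start);
        if (pos == std::string_view::npos)
            return false; // fewer tokens than requested
        start = pos + 1;
    }
    const std::size_t end = str.find(sep, start);
    // substr clamps the count, so npos means "up to the end of the string".
    const std::size_t len = (end == std::string_view::npos) ? end : end - start;
    return str.substr(start, len) == expected;
}
```

With this sketch, matchTokenSketch("a;bb;c", 1, ';', "bb") is true, while
asking for a token index past the last one simply yields false.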
> Also, a better approach overall would probably be some string_view-based
> getToken functionality (converting from an OUString to a string_view is
> cheap), and then string_view-based toInt etc. functions.
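A rough sketch of what such a string_view-based combination could look
like, using only the C++17 standard library (std::string_view and
std::from_chars) and no LibreOffice API. The names getTokenView and
tokenToInt32 are hypothetical, and the error handling (std::nullopt on
failure) is a choice of this sketch, not meant to mirror what
OUString::toInt32 does.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the token with index `token` as a view
// into `str`, without allocating. Returns an empty view if `str` has
// fewer tokens than requested.
std::string_view getTokenView(std::string_view str, std::size_t token, char sep)
{
    std::size_t start = 0;
    for (;;)
    {
        const std::size_t end = str.find(sep, start);
        if (token == 0)
        {
            // substr clamps the count, so npos means "up to the end".
            const std::size_t len =
                (end == std::string_view::npos) ? end : end - start;
            return str.substr(start, len);
        }
        if (end == std::string_view::npos)
            return {};
        start = end + 1;
        --token;
    }
}

// Hypothetical helper: parses the requested token as a decimal 32-bit
// integer directly from the view. Returns std::nullopt if the token is
// empty, out of range, or not entirely numeric.
std::optional<std::int32_t> tokenToInt32(std::string_view str, std::size_t token,
                                         char sep)
{
    const std::string_view t = getTokenView(str, token, sep);
    std::int32_t value = 0;
    const char* const last = t.data() + t.size();
    const auto [ptr, ec] = std::from_chars(t.data(), last, value);
    if (ec != std::errc() || ptr != last)
        return std::nullopt;
    return value;
}
```

A call such as tokenToInt32("a;12;c", 1, ';') would then replace the
getToken().toInt32() pattern without creating an intermediate string.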
At the moment I plan to just go through all getToken uses and do some
minor local optimizations; then I might have a look at the string_view
approach (unless the numbers above make the OUString one look not too
specialised).
Many thanks for your comments
Kind regards
Matteo
_______________________________________________
LibreOffice mailing list
LibreOffice@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/libreoffice