On 13.12.2019 05:43, Luboš Luňák wrote:
On Friday 13 of December 2019, Kohei Yoshida wrote:
I just finished my benchmark testing on mdds::multi_type_vector, and
summarized my results in this blog post:
http://kohei.us/2019/12/12/benchmark-results-on-mdds-multi_type_vector/
Hopefully my findings and interpretations make sense. In short, the
numbers look great. The overhead of block shifting is a concern, but
I'm optimistic that this is going to be a non-issue for the most part.
I'd really like to see benchmarks of Calc with this new mdds, especially
to see how many regressions there will be, as I'm concerned whether it
really would be worth it in reality.
Sure, I do share your concern, which is why I spent time designing and
implementing the benchmark I did so that I can get some answers for my
concern.
You say that the vast majority of Calc performance problems are with
updating cell values without shifting, but that makes sense because
that's where the current bottleneck is. Once the bottleneck moves to
shifting of cells, we may get a whole new slew of bug reports about that.
Sure, but that's just as much speculation as my own interpretation.
To be fair, it is possible that you are right, and I am wrong. But I
did provide my own interpretations of those numbers based on my own
experience and educated guesses. I'm not claiming that I'm right, but
I'm claiming that what I concluded in my post is my honest and, I hope,
reasonably researched opinion.
E.g. copy&paste of a column is very likely to hit a
problem there, IIRC it internally results in a lot of shifting of
cells.
Yes, which is why I ran the benchmarks to get some numbers to get more
clarity.
One interpretation of the graphs may be that the change helps a lot at
the cost of a regression in one place, but another possible
interpretation is that the change brings an improvement that can already
be mostly achieved using hints at the expense of a cost that cannot be
alleviated. Moreover we did go over all the reported performance
problems related to mdds some months back and fixed all of them (at
least I'm not aware of any pending ones). So the real question for me is
how many real-world cases will be improved and worsened by this, which
is why I'd like to see non-artificial benchmarks.
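
For readers unfamiliar with the "hints" mentioned above, here is a
minimal sketch of the idea. It is a toy container written purely for
illustration and is not the mdds API: ToyBlocks, Position and find() are
hypothetical names, and the linear scan is an assumption about how a
block lookup might work. The point is only that a lookup which resumes
from the previously found block visits far fewer blocks than one which
restarts from the first block each time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy container, NOT the mdds API: logical rows are grouped
// into consecutive blocks, and only the size of each block is stored.
struct ToyBlocks
{
    std::vector<std::size_t> sizes;

    struct Position
    {
        std::size_t block_index; // index of the block containing the row
        std::size_t block_start; // logical row at which that block begins
        std::size_t steps;       // blocks skipped during this search
    };

    // Find the block containing `row`, scanning forward from `hint`.
    // The default hint starts the scan at the first block.
    Position find(std::size_t row, Position hint = Position{0, 0, 0}) const
    {
        // A hint that starts past the requested row cannot be used.
        if (hint.block_start > row || hint.block_index >= sizes.size())
            hint = Position{0, 0, 0};

        std::size_t index = hint.block_index;
        std::size_t start = hint.block_start;
        std::size_t steps = 0;
        while (index < sizes.size() && start + sizes[index] <= row)
        {
            start += sizes[index];
            ++index;
            ++steps;
        }
        return Position{index, start, steps};
    }
};

// Look up each row from scratch; returns the total number of blocks skipped.
std::size_t total_steps_without_hint(
    const ToyBlocks& blocks, const std::vector<std::size_t>& rows)
{
    std::size_t total = 0;
    for (std::size_t row : rows)
        total += blocks.find(row).steps;
    return total;
}

// Look up each row starting from the previous result (the "hint").
std::size_t total_steps_with_hint(
    const ToyBlocks& blocks, const std::vector<std::size_t>& rows)
{
    std::size_t total = 0;
    ToyBlocks::Position hint{0, 0, 0};
    for (std::size_t row : rows)
    {
        hint = blocks.find(row, hint);
        total += hint.steps;
    }
    return total;
}
```

With many small blocks and rows visited in ascending order, the hinted
version does work roughly proportional to the number of blocks, while
the unhinted one grows quadratically. Hints do nothing for the cost of
shifting blocks, which is the part of the argument above that they
cannot address.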
So, I'm a bit concerned about your use of the word "artificial" to
describe my benchmark, because that word implies that I somehow made
those numbers up. Those are real numbers. Now, the numbers will of
course be quite different if you measure the entire Calc operations
which include a whole bunch of other operations, and I believe this is
what you are alluding to. I do share your concern there. But I thought
it was reasonable to draw the conclusions that I did, given that the I/O
with mdds::multi_type_vector does constitute a large part of Calc's cell
I/O. Also, keep in mind that the rest of the Calc operations are
constant, and the only variable is the mdds portion. On this point, I
believe it's not unreasonable to draw *some* conclusions based on the
numbers on mdds alone.
Having said that, you are of course free to draw your own, different
conclusions.
BTW, have you considered using vector operations like SSE for the
updates (either checking whether the compiler can employ them
automatically or hand-writing them)?
Yes. For one, I did look into e.g. OpenMP's auto SIMD support. But its
support appeared to be very limited, and MSVC did not seem to support
it. I also thought about hand-writing SIMD directly, and I am still
considering that as one of my future possibilities (note that I'm not
entirely done with this work). But I couldn't think of a good one to
use, especially when multi_type_vector uses an array of structures (AoS).
The SIMD intrinsics I know of are mostly not suitable for AoS. If you
know of good SIMD intrinsics that may work for multi_type_vector, I would
be interested.
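
To make the AoS point concrete, here is a minimal sketch contrasting the
two layouts for the block-shifting case. The struct and field names
(BlockAoS, BlocksSoA, position, size, data) are hypothetical and chosen
for illustration only; they are not the actual multi_type_vector
internals. With AoS, the position fields are interleaved with the other
members, so the update loop touches memory with a stride. With a
structure of arrays (SoA), the positions are contiguous, which is the
shape that compilers and SIMD intrinsics generally handle best.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structures (AoS): each block keeps its metadata together.
// Field names are hypothetical, for illustration only.
struct BlockAoS
{
    std::size_t position; // logical start row of the block
    std::size_t size;     // number of elements in the block
    void* data;           // placeholder for the element storage
};

// Structure of arrays (SoA): one contiguous array per field.
struct BlocksSoA
{
    std::vector<std::size_t> positions;
    std::vector<std::size_t> sizes;
    std::vector<void*> data;
};

// Shift the start position of every block from `first` onward by `delta`,
// as would be needed after inserting rows in front of those blocks.
void shift_positions(std::vector<BlockAoS>& blocks, std::size_t first, std::size_t delta)
{
    for (std::size_t i = first; i < blocks.size(); ++i)
        blocks[i].position += delta; // strided: one field out of three per block
}

void shift_positions(BlocksSoA& blocks, std::size_t first, std::size_t delta)
{
    std::size_t* p = blocks.positions.data();
    const std::size_t n = blocks.positions.size();
    for (std::size_t i = first; i < n; ++i)
        p[i] += delta; // unit stride: a candidate for auto-vectorization
}
```

Both overloads produce the same positions. Whether the SoA loop is
actually vectorized, and whether that is faster in practice, depends on
the compiler, the flags and the data, so it would have to be measured.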
I've done some SIMD coding in orcus to speed up XML and JSON parsing,
but I can't say I'm an expert at it, and I did not always manage to get the
code to run faster with SIMD.
Alright, since one person is now raising an objection to hastily
integrating this piece, I should hold off on integrating it for now and
let the discussion continue.
And, while I would love to craft another benchmark test involving the
entire Calc piece, I'm afraid I won't have enough bandwidth to do that.
Even running this benchmark on mdds alone took me one month end-to-end.
It would be nice to have someone else chip in and conduct another, more
thorough and satisfactory benchmark test, if anybody is interested.
Thanks,
Kohei
--
Kohei Yoshida, LibreOffice Calc volunteer hacker