Re: Why are RB-trees used to implement std::set?

Martin Sebor <msebor@xxxxxxxxxx> · Mon, 11 May 2015 08:54:52 -0600

On 05/11/2015 05:41 AM, Jakub Arnold wrote:
Hi,

I'm curious as to why libstdc++ is using a RB-tree to implement
std::set (details here
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/set
and here https://github.com/gcc-mirror/gcc/blob/master/libstdc++-v3/include/bits/stl_tree.h),
when there are faster alternatives?

Because the original HP STL that most implementations (including
libstdc++) are derived from was written that way. Changing the
underlying data structure would likely break binary compatibility
and so the benefits of such a change would have to significantly
outweigh its costs.

Martin

I'm mainly curious, because from all the people I've asked there
hasn't been a single answer in favor of RB-trees, other than "they're
already popular" or "easy to implement".

If you'd like more details on that, here's a link to my question on
stackexchange http://cs.stackexchange.com/questions/41969/why-are-red-black-trees-so-popular,
where nobody has yet answered as well.

TL;DR: Using a variant of B-tree (or (a,b)-tree) should be faster, and
everyone seems to suggest so, which makes me wonder what is the reason
for picking RB-trees? This is not a question about their theoretical
speed, but about real world behavior on real hardware with respect to
CPU caches, etc.

--
Jakub