Re: parallel index creation: maintenance_work_mem not honored?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jun 25, 2022 at 9:37 AM MichaelDBA <MichaelDBA@xxxxxxxxxxx> wrote:
> If "External sorts are often faster than internal sorts", why bother with increasing work_mem to avoid sorting on disk?

They're faster despite spilling to disk -- not because of it. This
effect is quite visible at times, though usually only with large
sorts. It's noticeable, but not huge. And it depends on things like
the datatypes being sorted, to a degree. (My original point was that
external sorts shouldn't be assumed to be significantly slower or
significantly less efficient than equivalent large internal sorts,
where CPU costs tend to dominate either way.)

Only an external sort can do a final on-the-fly merge, where tuplesort
outputs tuples in their final sorted order before the entire input has
been fully sorted. It's not uncommon for that to more than make up for
any latency from writing out temp files. The added latency is usually
not noticeable because we also need to read from random locations in
memory to gather tuples to output to temp files. That can totally hide
any I/O latency in many important cases.

In principle you could get the same benefit with an internal sort,
without any I/O for temp files -- nothing makes it fundamentally
impossible to teach tuplesort how to do a similar final on-the-fly
strategy. But that's tricky for a variety of reasons, mostly related
to the fact that the final size of the input cannot easily be
predicted. That's why it hasn't been implemented.

-- 
Peter Geoghegan





[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux