Re: Hints (was Poor performance using CTE)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Craig Ringer wrote:
> On 11/21/2012 09:35 AM, Craig James wrote:
>> Why not make an explicit hint syntax and document it? I've still
>> don't understand why "hint" is a dirty word in Postgres. There are
>> a half-dozen or so ways in common use to circumvent or correct
>> sub-optimal plans.
> 
> The reason usually given is that hints provide easy workarounds for
> planner and stats issues, so people don't report problems or fix
> the underlying problem.
> 
> Of course, if that's all there was to it, `OFFSET 0` would be made
> into an error or warning, or ignored and not fenced.
> 
> The reality is, as you say, that there's a need, because the
> planner can never be perfect - or rather, if it were nearly
> perfect, it'd take so long to read the stats and calculate plans
> that everything would be glacially slow anyway. The planner has to
> compromise, and so cases will always arise where it needs a little
> help.
> 
> I think it's time to admit that and get the syntax in place for
> CTEs so there's room to optimize them later, rather than cementing
> CTEs-as-fences in forever as a Pg quirk.

It's a tough problem. Disguising and not documenting the available
optimizer hints leads to more reports on where the optimizer should
be smarter, and has spurred optimizer improvements. And many type of
hints would undoubtedly cause people to force what they *think* would
be the best plan in many cases where they are wrong, or become wrong
as data scales up. But it does seem odd every time I hear people
saying that they don't want to eliminate some optimization fence
because "they find it useful" while simultaneously arguing that we
don't have or want hints.

Having a way to coerce the optimizer from the plan it would take with
straightforward coding *is* a hint, and one down-side of hiding the
hints inside syntax mostly supported for other reasons is that people
who don't know about these clever devices can't do reasonable
refactoring of queries for readability without risking performance
regressions. Another down-side is that perfectly reasonable queries
ported from other databases that use hint syntax for hints run afoul
of the secret hints when trying to run queries on PostgreSQL, and get
performance potentially orders of magnitude worse than they expect.

I'm not sure what the best answer is, but as long as we have hints,
but only through OFFSET 0 or CTE usage, that should be documented.
Better, IMV, would be to identify what sorts of hints people actually
find useful, and use that as the basis for TODO items for optimizer
improvement as well as inventing clear ways to specify the desired
coercion. I liked the suggestion that a CTE which didn't need to be
materialized because of side-effects or multiple references have a
keyword. Personally, I think that AS MATERIALIZED x (SELECT ...)
would be preferable to AS x (SELECT ... OFFSET 0) as the syntax to
specify that.

Regarding the above-mentioned benefits we would stand to lose by
having clear and documented hints, perhaps we could occasionally
solicit input on where people are finding hints useful to get ideas
on where we might want to improve the optimizer. As far as worrying
about people using hints to force a plan which is sub-optimal --
isn't that getting into nanny mode a bit too much?

-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux