RE: [EXTERNAL] Re: performance expectations for table(s) with 2B recs

"Godfrin, Philippe E" <Philippe.Godfrin@xxxxxxx> · Fri, 10 Dec 2021 18:04:07 +0000

On 2021-12-08 14:44:47 -0500, David Gauthier wrote:

> So far, the tables I have in my DB have relatively low numbers of records (most

> are < 10K, all are < 10M).  Things have been running great in terms of

> performance.  But a project is being brainstormed which may require some tables

> to contain a couple billion records.

[...]

> What else should I be worried about ?

> 

> I suspect that part of why things are running really well so far is that the

> relatively small amounts of data in these tables ends up in the DB cache and

> disk I/O is kept at a minimum.  Will that no longer be the case once queries

> start running on these big tables ?

>
>Depends a lot on how good the locality of your queries is. If most read
>only the same parts of the same indexes, those will still be in the
>cache. If they are all over the place or if you have queries which need
>to read large parts of your tables, cache misses will make your
>performance a lot less predictable, yes. That stuff is also hard to
>test, because when you are testing a query twice in a row, the second
>time it will likely hit the cache and be quite fast.
> 
>But in my experience the biggest problem with large tables are unstable
>execution plans - for most of the parameters the optimizer will choose
>to use an index, but for some it will erroneously think that a full
>table scan is faster. That can lead to a situation where a query
>normally takes less than a second, but sometimes (seemingly at random)
>it takes several minutes - users will understandably be upset about such
>behaviour. It is in any case a good idea to monitor execution times to
>find such problems (ideally before users complain), but each needs to be
>treated on an individual basis, and sometimes there seems to be no good
>solution.

To the OP, that’s is a tall order to answer – basically that’s wjhy DBA’s still have
Jobs…

For Peter I have a question. What exactly causes ‘unstable execution plans’ ??

Besides not using bind variables, bad statistics, would you elaborate in what would

contribute to that instability?