On Sat, Aug 27, 2016 at 7:13 AM, Craig James <cjames@xxxxxxxxxxxxxx> wrote:
> On Fri, Aug 26, 2016 at 9:11 PM, Jim Nasby <Jim.Nasby@xxxxxxxxxxxxxx> wrote:
>> On 8/26/16 3:26 PM, Mike Sofen wrote:
>>> Is there a way to keep query time constant as the database size grows?
>>
>> No. More data == more time. Unless you find a way to break the laws of physics.
>
> Straight hash-table indexes (which Postgres doesn't use) have O(1) access time.
But he isn't doing single-row lookups; he is doing large aggregations. If you have to aggregate N rows, doing an O(1) operation on N different occasions is still O(N).
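Just to make that concrete, here is a minimal Python sketch (purely illustrative, nothing to do with Postgres internals): even when each individual lookup is a constant-time hash probe, aggregating over N of them still costs time proportional to N.

# Illustrative only: a single dict lookup is O(1), but an aggregate
# over N rows still performs N of them, so total work grows with N.
import time

def aggregate(table, keys):
    total = 0
    for k in keys:          # N constant-time lookups ...
        total += table[k]   # ... still add up to O(N) total work
    return total

for n in (10_000, 100_000, 1_000_000):
    table = {i: 1 for i in range(n)}
    keys = list(range(n))
    start = time.perf_counter()
    aggregate(table, keys)
    elapsed = time.perf_counter() - start
    print(f"{n:>9} rows: {elapsed:.4f} s")   # time scales roughly with n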
Not that big-O is very useful here anyway. It assumes that either everything fits in RAM (and is already there), or that nothing fits in RAM and everything has to be fetched from disk, even the index root pages, every time it is needed. Tommi is not operating in an environment where the first assumption holds, and no one operates in an environment where the second assumption holds.
As N increases beyond available RAM, your actual time for a single lookup is going to be a weighted average of two different constant-time operations, one with a small constant and one with a large constant. Big-O notation ignores this nicety and assumes every operation happens at the slower speed, because that is the limit of the weighted average as N gets very large. But real-world systems do not operate at the infinite limit.
So his run time could easily grow like N^2 over the range he cares about: as the table grows he aggregates more rows, and each of those rows is also less likely to be a cache hit.
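A back-of-the-envelope model of that effect (all constants below are invented for illustration, not measured on anyone's hardware): the per-row cost is a weighted average of a cheap cached access and an expensive disk fetch, with the hit fraction shrinking once N outgrows RAM; multiplying that by the N rows aggregated gives a total that climbs much faster than linearly through the transition region, which is where the "looks like N^2 (or worse)" behavior comes from, before it eventually settles back to linear with a much larger constant.

# Toy cost model, not a benchmark: every constant here is an assumption.
CACHE_ROWS = 1_000_000      # rows that fit in RAM (assumed)
HIT_COST   = 0.0001         # ms per row when it is already cached (assumed)
MISS_COST  = 5.0            # ms per row when it needs a disk read (assumed)

def expected_lookup_cost(n_rows):
    """Weighted average of the two constant-time operations."""
    hit_fraction = min(1.0, CACHE_ROWS / n_rows)
    return hit_fraction * HIT_COST + (1 - hit_fraction) * MISS_COST

def aggregate_time(n_rows):
    """Aggregating n_rows multiplies the per-row cost by n_rows again."""
    return n_rows * expected_lookup_cost(n_rows)

for n in (100_000, 1_000_000, 10_000_000, 100_000_000):
    print(f"{n:>12,} rows -> {aggregate_time(n) / 1000:,.1f} s (modelled)")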
Cheers,
Jeff