Search Postgresql Archives

Re: POSTGRES DB 3 800 000 rows table, speed up?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Could you explain this a little bit more?
What are the conditions of this situation that makes b-tree ineffective?

Thanks
SWK

-----Original Message-----
From: pgsql-general-owner@xxxxxxxxxxxxxx
[mailto:pgsql-general-owner@xxxxxxxxxxxxxx] On Behalf Of Tom Lane
Sent: 2005. december 28. 20:04
To: Jim C. Nasby
Cc: Eugene; pgsql-general@xxxxxxxxxxxxxx
Subject: Re: [GENERAL] POSTGRES DB 3 800 000 rows table, speed up? 

"Jim C. Nasby" <jnasby@xxxxxxxxxxxxx> writes:
> On Tue, Dec 27, 2005 at 11:25:37PM +0200, Eugene wrote:
>> I ask db like this  SELECT * FROM ipdb2 WHERE '3229285376' BETWEEN ipfrom

>> AND ipto;

> I'm pretty sure PostgreSQL won't be able to use any indexes for this
> (EXPLAIN ANALYZE would verify that). Instead, expand the between out:

> WHERE ipfrom >= '...' AND ipto <= '...'

That won't help (it is in fact exactly the same query, because BETWEEN
is just rewritten into that).  The real problem is that btree indexes
are ill-suited to this type of condition.  If the typical row has only
a small distance between ipfrom and ipto then the query is actually
pretty selective, but there is no way to capture that selectivity in
a btree search, because neither of the single-column comparisons are
selective at all.  The planner realizes this and doesn't bother with
the index, instead it just does a seqscan.

You could probably get somewhere by casting the problem as an rtree
or GIST overlap/containment query, but with the currently available
tools it would be a pretty unnatural-looking query ... probably
something like
	box(point(ipfrom,ipfrom),point(ipto,ipto)) ~
	box(point(3229285376,3229285376),point(3229285376,3229285376))
after creating an rtree or GIST index on
	box(point(ipfrom,ipfrom),point(ipto,ipto))
(haven't tried this but there is a solution lurking somewhere in this
general vicinity).

Is there a good reason why the data is stored this way, and not as
say a single "cidr" column containing subnet addresses?  Querying
	WHERE '192.122.252.0' << cidrcolumn
would be a much more transparent way of expressing your problem.
We don't currently have an easy indexing solution for that one either,
but we might in the future.

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
       message can get through to the mailing list cleanly



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux