Search Postgresql Archives

Re: query time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



WireSpot wrote:
On Wed, 02 Feb 2005 14:48:41 +0000, Richard Huxton <dev@xxxxxxxxxxxx> wrote:

Think about it, you'd need an index that ordered use_name so that
(john_doe, AAAAAjohn_doe, Zjohn_doe1234) were all next to each other.

If you anchor the search (LIKE 'john_doe%') and are using the C locale
then an index can be used (because it's the same as >='john_doe' AND
<'john_dof').


Unfortunately, all my cases are LIKE '%john_doe'. So I'm guessing I'll
never get an index.

Well, if you want names *ending* in john_doe you can create a function reverse() and then create a functional index on it. If you want something in the middle, tough.


How about the response time when doing both LIKE user_name and ORDER
BY timestamp? Why does it get blown out of the water like that, from 5
and 5 to 2000 ms? If a LIKE by itself takes 5 ms and an ORDER by
itself takes 5 ms... Doesn't it grab the results matching the LIKE and
the ORDER only those?

In your first '%john_doe%' example you had a LIMIT 100 without a sort. So - the sequential scan started and when PG found the 100th match it stopped. If you order by timestamp, it has to find all the matches, sort them by timestamp and then discard the 101st onwards.


Now, the reason it doesn't use the timestamp index instead is that it thinks that your LIKE is going to be very specific:
(cost=0.00..8250.17 rows=1 width=175)
In actual fact, there isn't one matching row there are 15,481:
(actual time=0.432..1051.036 rows=15481 loops=1)
I'm not sure that there's any way for PG to know how many matches there are going to be - it keeps track of values, but there's no easy way to model substrings of a column.


Are you sure you need to match user_name against a double-wildcarded value? If you do, all I could suggest is perhaps limiting the search to one day/week/whatever at a time, which might make the timestamp index seem appealing.

While we're on the subject of indexes, is there any way I can speed up
a SELECT DISTINCT user_name FROM log_entries? With 200.000 entries I
get like 46 seconds on this one.

Not sure there's a simple way to avoid a seq-scan of the whole table. PG has good concurrency support with MVCC, but the downside is that the indexes don't store whether something is visible or not. That means you can't just look at the index to determine what values are visible to your current transaction.


> I apologise if these things are already in the manual, I'm only now
> getting used to it and I don't find some things right away.

No problem - all sensible questions. Worth checking the mailing-list archives too though.

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux