Search Postgresql Archives

Re: Query results caching?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 22, 2005 at 10:13:49PM +0200, Ben-Nes Yonatan wrote:

I think that I was misunderstood, Ill make an example:
Lets say that im making the following query for the first time on the "motorcycles" table which got an index on the "manufacturer" field:

EXPLAIN ANALYZE SELECT manufacturer FROM motorcycles WHERE manufacturer='suzuki';
... Total runtime: 3139.587 ms

Now im doing the same query again and i get a much faster result (cause of the "caching"): Total runtime: 332.53 ms

After both of those queries I drop the index and query the table again with the exact same query as before and now I receive: Total runtime: 216834.871 ms

And for my last check I run the exact same query again (without creating the INDEX back again) and I get quite similar result to my third query: Total runtime: 209218.01 ms


My problem is that (maybe I just dont understand something basic here...) the last 2 (also the second query but I dont care about that) queries were using the "cache" that was created after the first query (which had an INDEX) so none of them actually showed me what will happen if a client will do such a search (without an INDEX) for the first time.

I want to delete that "caching" after I do the first 2 queries so my next queries will show me "real life results".



Ok I tried to handle both of your replies cause I got them at 2 seperate emails.

Dann Corbit wrote:
> These results are all what I would expect.  When you delete the index,
> the query will be forced to do a table scan (to examine every single
> record in the table one by one).  If the table is non-trivial it is
> unlikely that either the OS or the database will cache the whole thing
> in memory.  However, when you query a small record set, then it is
> likely to be retained in RAM which is literally thousands of times
> faster than disk.

Didnt know that, good to know though doesnt assure me...
What if I drop the INDEX but create a diffrent INDEX which also make the process alot faster then without an INDEX but slower/faster then the one before, will it wont use the former "caching"?

> If a search is to be made on a frequent basis, you should create an
> index.
> The query results above show you why.

Obvious :)

> Think about this for a minute.  The real life results you want are
> very fast results.  For that reason, you should try to model the
> customer queries as nearly as possible.  If you have a canned
> application like order entry, then the real parameterized query set
> will probably be quite small in real life.  If you are creating a
> server for ad-hoc queries then it will be far more difficult to model > in real life.
>
> What is the real purpose of the application that you are writing?
>
> Will users be using a pre-programmed front end, or will they be typing
> in queries free-form for whatever their heart desires?

Ok ill try to describe the system as short & precise as possible (its also passed midnight here :)). Each day I receive about 4 million rows of data (products) which I insert into table1 (after I delete all of the previous data it had), along it I receive for every row about another 15 keywords which I insert into table2 (where as in table1 I delete all of the previous data it had also), this process is a fact that I cant change. Now the users of the site can search for data from table1 by typing whichever (and up to 4) words as they want at a text field (search input string) and the server should display the correct results by querying table1 & join table2 for its keywords. I succeded to do it quite fast but when I tried to ORDER BY my results its times jumped up drastically (2-3 seconds for a query... and thats after the caching..). I can't allow a situation where a user will search with a keyword which wasnt 'cached' before and because of that he will wait 15 seconds for a result.

Jim C. Nasby wrote:
Emptying the cache will not show real-life results. You are always going
to have some stuff cached, even if you get a query for something new. In
this case (since you'll obviously want those indexes there), after some
amount of time you will have most (if not all) of the non-leaf index
pages cached, since they take a fairly small amount of memory and are
frequently accessed. This makes index traversal *much* faster than your
initial case shows, even if you query on something different each time.
Testing with a completely empty cache just isn't that realistic.

As far as I understand it at my situation where all of the data is deleted and inserted each day from the start (INDEX will get lost with it..) & the endless variety of possible keywords search's & the immense size of the tables, the following reason wont last.. or am I wrong here?



Because of all of that I want to be able to see how much time a query takes when its the first time its being run..... or I'm wrong again and failing to understand something?

Again everyone THANKS ALOT its really amazing the help that I receive from you!
Ben-Nes Yonatan

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux