Search Postgresql Archives

Re: join with redundant results VS simpler join plus multiple selects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



WireSpot wrote:
I have a schema with galleries, people and images. Each person has a
bunch of private images. People can join any number of galleries and
can publish any of their images to the galleries they join (or not).

I'd like to retrieve a data set where for a given gallery id I get all
the people AND all the images they've published to that gallery.

I can do this in two ways.
1) Do a join that will give me the people that belong to said gallery,
then loop in the code and do simple selects to retrieve images in that
gallery for each of them.
2) Do a join between all three tables. The end result will have as
many rows as total images for all the people in the gallery.
Obviously, there's going to be redundant data, since a person's info
will be repeated for each image.

Which is better in terms of performance? I used EXPLAIN ANALYZE and
actual queries and it seems to suggest that option 2, while returning
redundant info, is faster.

That's probably going to be the case. PostgreSQL won't need to read the redundant info in from disk each time, and relative to the image data it's going to be pretty small. By doing it all in one join you're avoiding the overhead of all those network round trips (if on a network), statement preparation and planning, etc etc etc. Additionally, PostgreSQL is probably going to be using a join plan that's much more efficient than anything you'll get by looping over each user and asking for images.

If you wanted to avoid returning too much redundant info, you could always do it in two queries:

- Find a list of all users belonging to the gallery and any other non-image data associated with them; then

- Retrieve all images in one query using a join against the list of users who're members of the gallery, but only actually return (eg) the user id, gallery id, and image data for each image.

In all honestly, though, it probably doesn't matter unless there's a LOT of additional data you want to obtain about each user.

Note, however, that when testing method (1) in your post you will REALLY need to make sure that you're using parameterized prepared statements for the image queries.

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux