On Wed, Nov 16, 2005 at 07:48:06PM -0500, Tom Lane wrote: > Bill Moseley <moseley@xxxxxxxx> writes: > > [ strange behavior ] > > Oh, duh, it's not a PG bug: the problem is that the view is > underspecified. You have > > SELECT DISTINCT ON (class.id) > ... a bunch of stuff ... > FROM ... a bunch of tables ... > ORDER BY class.id; > > The difficulty with this is that DISTINCT ON will take the first row in > each group with the same class.id. And since you're only sorting by > class.id, "the first row" is ill-defined. Sorry, but I fear I'm missing something. That ORDER BY is added by PG -- it's not part of my view when I define it. I assume PG adds that so it can do the DISTINCT ON. Still, I don't have any duplicate class.id rows in this select that I can see. class.id 1243 and 1244 are not the same, yet PG is (sometimes) throwing out one of them. Are you saying that somehow PG thinks they are the same class.id and is thus removing one? I'm asking for a list of all classes taught by instructor 84. ws2=> select * from instructors where person = 84 order by class; person | class --------+------- 84 | 727 84 | 739 84 | 804 84 | 813 84 | 867 84 | 1243 84 | 1244 (7 rows) The reason I'm using DISTINCT ON is because the "class_list" view is suppose to just return a list of unique classes, and a class might have more than one instructor which would result in extra rows -- as shown here: ws2=> select * from instructors where class in (select class from instructors where person = 84); person | class --------+------- 84 | 727 84 | 739 84 | 804 84 | 813 84 | 867 84 | 1243 26 | 1243 84 | 1244 26 | 1244 (9 rows) So when I don't want duplicates: ws2=> select distinct on (class) * from instructors where class in (select class from instructors where person = 84); person | class --------+------- 84 | 727 84 | 739 84 | 804 84 | 813 84 | 867 84 | 1243 84 | 1244 (7 rows) I don't care in this case about the DISTINCT ON throwing out the duplicates -- I just care about distinct classes, not that all the instructors are included in this select. And even if I throw in all my other joins I get the same thing: ws2=> SELECT DISTINCT ON (class.id) ws2-> class.id AS id ws2-> ws2-> FROM class, location, region, person, instructors ws2-> ws2-> WHERE class.location = location.id -- join with location ws2-> AND class.id = instructors.class -- join the instructors ws2-> AND instructors.person = person.id -- join the person(s) ws2-> AND location.region = region.id -- join the location to a region ws2-> AND person.id = 84; id ------ 727 739 804 813 867 1243 1244 (7 rows) > I'm not sure why qsort's > behavior seems to depend on the width of the rows, but there's no doubt > that it's sorting different rows to the front of each group depending > on which view you use. I just don't see what groups there are, though in this case. > It could also be that you don't want to be using DISTINCT ON at all; > have you thought through exactly what this view ought to produce for > each class.id? Yes, I think so. A list of columns related to it, with the exception of when there's duplicate instructors I want one of those duplicates thrown out (and I don't care which one). When I do a query that generates duplicate class.id's such as when a class has more than one instructor: ws2=> select class.id AS class_id, ws2-> person.id AS person_id ws2-> ws2-> FROM class, location, region, person, instructors ws2-> ws2-> WHERE class.location = location.id -- join with location ws2-> AND class.id = instructors.class -- join the instructors ws2-> AND instructors.person = person.id -- join the person(s) ws2-> AND location.region = region.id -- join the location to a region ws2-> AND class_time > now(); class_id | person_id ----------+----------- 561 | 95 614 | 95 747 | 111 762 | 111 772 | 111 883 | 13 924 | 26 935 | 26 945 | 26 1243 | 84 1243 | 26 1244 | 84 1244 | 26 (13 rows) You can see some classes are listed twice, so using distinct on gets just my list of unique classes: ws2=> SELECT DISTINCT ON (class.id) ws2-> class.id AS class_id, ws2-> person.id AS person_id ws2-> ws2-> FROM class, location, region, person, instructors ws2-> ws2-> WHERE class.location = location.id -- join with location ws2-> AND class.id = instructors.class -- join the instructors ws2-> AND instructors.person = person.id -- join the person(s) ws2-> AND location.region = region.id -- join the location to a region ws2-> AND class_time > now(); class_id | person_id ----------+----------- 561 | 95 614 | 95 747 | 111 762 | 111 772 | 111 883 | 13 924 | 26 935 | 26 945 | 26 1243 | 84 1244 | 84 (11 rows) All the view does is include more columns in the result set, and that seems to change the number of results. Here's the view doing the previous query based on "class_time": ws2=> select id, person_id from class_list where class_time > now(); id | person_id ------+----------- 561 | 95 614 | 95 747 | 111 762 | 111 772 | 111 883 | 13 924 | 26 935 | 26 945 | 26 1243 | 26 << PG selected the other instructor this time 1244 | 84 (11 rows) FYI - For those following along, none of those queries show the actual problem I'm having with the view. Which is the view of the above join is returning the wrong number of rows -- 1243 is missing. Again, selecting by person_id: ws2=> select id from class_list where person_id = 84; id ------ 727 739 804 813 867 1244 (6 rows) I have a feeling this is going to be a "doh!" when I finally see it... -- Bill Moseley moseley@xxxxxxxx ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq