Re: Efficient sorting the results of a join, without denormalization

"David G. Johnston" <david.g.johnston@xxxxxxxxx> · Sat, 30 May 2015 21:33:13 -0700

On Saturday, May 30, 2015, Glen M. Witherington <glen@xxxxxx> wrote:
Sorry about the horrendous subject, let me explain by example:

Let's take this schema:

```

CREATE TABLE a (

  id bigserial PRIMARY KEY,

  created_at timestamp with time zone  NOT NULL  DEFAULT NOW()

);

CREATE TABLE b(

  id bigserial PRIMARY KEY,

  a_id bigint NOT NULL REFERENCES a(id),

  created_at      timestamp with time zone  NOT NULL  DEFAULT NOW()

);

CREATE TABLE c(

  id bigserial PRIMARY KEY,

  b_id bigint NOT NULL REFERENCES b(id),

  created_at      timestamp with time zone  NOT NULL  DEFAULT NOW()

);

```

And let's fill it up with some dummy data, that roughly matches the

distribution of mine:

```

INSERT INTO a SELECT FROM generate_series(1, 5);

INSERT INTO b(a_id) SELECT (i % 5) + 1 FROM generate_series(1, 100) i;

INSERT INTO c(b_id) SELECT (trunc(random() * 100)+1) FROM

generate_series(1, 1000000);

```

And here's the query I want to do, efficiently:

````

SELECT * FROM c

  JOIN b ON b.id = c.b_id

  JOIN a ON a.id = b.a_id

WHERE a.id = 3

ORDER BY b.created_at DESC

LIMIT 10

```

There seems to simply be no index I can put on the data, that allows me

to run this query efficiently. Four hours of playing with this, the only

solution I can see is, normalizing table `c` by adding a field "b's

a_id" and then creating an index on  (bs_a_id, created_at).

But surely there's a better solution?

This is one problem with using made up surrogate keys...

The PK of A is a component of both the PK of B and the PK of C but you throw that information away by using serial fields for PKs instead.  You should have unique indexes on B and C that incorporate the ID from A and then indeed you will end up with a join sequence that can be executed against efficiently.

All that said you really should put indexes on the foreign keys...

I haven't run the code to see the actual plan....

David J.