On Sat, Sep 17, 2011 at 6:46 PM, Mike Christensen <mike@xxxxxxxxxxxxx> wrote:
> I have a table that looks something like this:
>
> url - character varying(1024)
> date - timestamptz
> body - text
>
> Url is a unique primary key.  Body can potentially be a couple hundred
> kB of text.
>
> There will at first be perhaps 100,000 rows in this table, but at some
> point it might get into the millions.
>
> I need to be able to quickly insert into this table (I might be
> inserting several rows per second at times).  I also need to be able
> to very quickly see if a URL already exists in the table, and what its
> date value is.  Or, query for all urls that have a date older than
> x days.
>
> Am I better off with two tables such as:
>
> Table1:
> id - uuid or integer (primary key)
> url - unique index
> date
>
> Table2:
> id - FK to Table1.id
> body - text
>
> It makes the program flow a bit more complicated, and I'd have to use
> transactions when inserting new rows.  However, for years I've been
> told that having rows with large chunks of text is bad for performance,
> forces that data to be paged into memory, and causes various other
> issues.  Any advice on this one?  Thanks!

What would be really cool is if PostgreSQL took values for body that
were over a few kB, compressed them, and stored them out of line in
another table.

Luckily for you, that's EXACTLY what it already does:

http://www.postgresql.org/docs/9.1/static/storage-toast.html

Cool, eh?
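
If you want to convince yourself, here's a quick sketch -- the table
name "pages" and its exact column definitions are just my guesses
based on the schema you described, so adjust to taste.  It shows the
on-disk (possibly compressed, possibly toasted) size of body next to
its raw length, plus the size of the TOAST table backing the heap:

  -- hypothetical table matching the schema described above
  CREATE TABLE pages (
      url  varchar(1024) PRIMARY KEY,
      date timestamptz NOT NULL,
      body text
  );
  CREATE INDEX pages_date_idx ON pages (date);  -- for the "older than x days" queries

  -- stored size of body (after any compression / out-of-line storage)
  -- versus its raw character length
  SELECT url,
         pg_column_size(body) AS stored_bytes,
         length(body)         AS raw_chars
    FROM pages
   LIMIT 5;

  -- size of the TOAST table holding the out-of-line body values
  SELECT pg_size_pretty(pg_relation_size(reltoastrelid))
    FROM pg_class
   WHERE relname = 'pages';

Short body values just stay in the main heap; anything that pushes the
row over roughly 2 kB gets compressed and, if still too large, moved
out to the TOAST table automatically, so there's usually no need to
split the table by hand.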