On Mon, Nov 2, 2015 at 12:19 AM, Andrey Osenenko
<andrey.osenenko@xxxxxxxxx> wrote:
>
> It also looks like if there was a way to create a table with just a
> primary key and add an index to it that indexes data from another
> table, it would work much, much faster, since there would be very
> little to read from disk after the index lookup. But it looks like
> there isn't.

There is a way to do this, but it is pretty gross. You can define a
function which takes the primary key as input and returns the data to
index. Mark the function as immutable, even though it isn't. Something
like:

CREATE OR REPLACE FUNCTION public.filler_by_aid(integer)
 RETURNS text
 LANGUAGE sql
 IMMUTABLE
AS $function$
  -- not actually immutable: it reads the current contents of another table
  select filler::text from pgbench_accounts where aid = $1
$function$;

Then create a table which has just the primary key, and create a
functional index on it:

create table foobar as select aid from pgbench_accounts;
create index on foobar (filler_by_aid(aid));

Now you can query the skinny table by reference to the data in the wide
table:

explain analyze select count(*) from foobar
  where filler_by_aid(aid) = 'zebra';

Since you fibbed to PostgreSQL about the function's immutability, it is
pretty easy to get a corrupt index here. Every time the parent row is
updated, you have to be sure to delete and reinsert the corresponding
primary key in the skinny table, otherwise the functional index will
not reflect the updated value (a sketch of such a trigger follows at
the end of this message). What you gain from the skinny table you could
very well lose to the triggers needed to maintain it. Not to mention
the fragility.

It would be simpler if you could just force the wide data to always be
toasted, even when it is not wide enough to trigger the default toast
threshold. You would get a skinnier table (although not quite as skinny
as one with only a single column) without having to resort to
unsupported hacks. (I am assuming here, without real evidence other
than intuition, that most of your news articles are in fact coming in
under the toast threshold.)

> So am I correct in assuming that as the number of rows grows, query
> times for rows that are not in memory (and considering how many of
> them there are, most won't be) will grow linearly?

Yep. What you really need are index-only scans. But those are not
supported by GIN indexes, and given the GIN index structure it seems
unlikely they will ever be, at least not in a way that would help you.

What are you going to do with these 23,000 primary keys once you get
them, anyway? Perhaps you can push that analysis into the database and
gain some efficiencies there.

Or you could change your data structure. If all you are doing is
searching for one tsvector token at a time, you could unnest the
tsvector and store the tokens in a table like (ts_token text, id_iu
bigint). Then build a regular btree index on (ts_token, id_iu) and get
index-only scans, once you upgrade from 9.1 to 9.2 or later, where
index-only scans are supported (a sketch follows below).

Cheers,

Jeff
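
A minimal sketch of the maintenance trigger described above, sticking
with the pgbench_accounts/foobar example; the function and trigger
names here are just illustrative:

-- Sketch only: keeps foobar (and its functional index) in step with
-- pgbench_accounts. Fired AFTER UPDATE so that filler_by_aid(), called
-- during the reinsert, already sees the updated parent row.
create or replace function foobar_resync() returns trigger
language plpgsql as $$
begin
    -- delete and reinsert so the functional index entry is rebuilt
    -- from the parent's current data (also handles a changed aid)
    delete from foobar where aid = old.aid;
    insert into foobar (aid) values (new.aid);
    return null;  -- return value is ignored for AFTER row triggers
end;
$$;

create trigger foobar_resync_trg
    after update of filler on pgbench_accounts
    for each row execute procedure foobar_resync();

-- INSERT and DELETE on the parent need similar triggers; this is the
-- maintenance overhead (and fragility) the message warns about.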
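
On the forced-toasting point: there is no clean way to lower the toast
threshold on the 9.x releases discussed in this thread, but for
reference, ALTER COLUMN ... SET STORAGE has long controlled how toasted
values are stored, and PostgreSQL 11 later added a per-table
toast_tuple_target parameter that comes close to what is wished for
here. A sketch, using a hypothetical articles/body schema:

-- Hypothetical names: an "articles" table with a wide "body" column.
-- SET STORAGE EXTERNAL stores toasted values out of line, uncompressed,
-- but does not by itself lower the toasting threshold.
alter table articles alter column body set storage external;

-- PostgreSQL 11 and later only: lower the size at which rows are
-- toasted, so even modest-sized bodies move out of the main heap.
alter table articles set (toast_tuple_target = 256);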
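
And a sketch of the unnested-token layout from the last suggestion.
unnest(tsvector) only appeared in PostgreSQL 9.6, and the news/tsv
names are stand-ins for the actual schema:

-- Hypothetical source schema: news(id_iu bigint, tsv tsvector).
-- Build one row per (token, document) pair:
create table news_tokens as
    select t.lexeme as ts_token, n.id_iu
    from news n, unnest(n.tsv) as t;  -- unnest(tsvector): 9.6+

-- btree over both the search column and the returned column, so
-- lookups can be answered from the index alone:
create index on news_tokens (ts_token, id_iu);

-- vacuum so the visibility map permits index-only scans:
vacuum analyze news_tokens;

-- example lookup, returning matching document ids straight from
-- the index:
select id_iu from news_tokens where ts_token = 'zebra';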