> I have a table with 9,961,914 rows in it (see the describe of > bigtab_stats_fact_tmp14 below) > > I also have a table with 7,785 rows in it (see the describe of xsegment_dim > below) > > I'm running the join shown below and it takes > 10 hours and eventually runs > out of disk space on a 1.4TB file system > > I've included below a describe of both tables, the join and an explain plan, > any help / suggestions would be much appreciated ! > > I need to get this beast to run as quickly as possible (without filling up > my file system) > > > Thanks in advance... What version of postgresql are you using? According to http://www.postgresql.org/docs/8.2/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY you may benefit from adjusting work_mem. You also index segment_srcid (in table xsegment_dim) but if you search for NULL and you have enough of those it defaults to a seq. scan: Seq Scan on xsegment_dim segdim (cost=0.00..1354.85 rows=3643 width=40) > Filter: (segment_srcid IS NULL) Maby you could insert some default value into segment_srcid (some arbitrary large numbers) instead of NULL and then search for values greater than?? You could also try to lower random_page_cost from default to 2. > select > f14.xpublisher_dim_id, > f14.xtime_dim_id, > f14.xlocation_dim_id, > f14.xreferrer_dim_id, > f14.xsite_dim_id, > f14.xsystem_cfg_dim_id, > f14.xaffiliate_dim_id, > f14.customer_id, > pf_dts_id, > episode_id, > sessionid, > bytes_received, > bytes_transmitted, > total_played_time_sec, > segdim.xsegment_dim_id as episode_level_segid > from > bigtab_stats_fact_tmp14 f14, > xsegment_dim segdim > where > f14.customer_id = segdim.customer_srcid > and f14.show_id = segdim.show_srcid > and f14.season_id = segdim.season_srcid > and f14.episode_id = segdim.episode_srcid > and segdim.segment_srcid is NULL; > > > > > > > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Merge Join (cost=1757001.74..73569676.49 rows=3191677219 width=118) > Merge Cond: ((segdim.episode_srcid = f14.episode_id) AND > (segdim.customer_srcid = f14.customer_id) AND (segdim.show_srcid = > f14.show_id) AND (segdim.season_srcid = f14.season_id)) > -> Sort (cost=1570.35..1579.46 rows=3643 width=40) > Sort Key: segdim.episode_srcid, segdim.customer_srcid, segdim.show_srcid, > segdim.season_srcid > -> Seq Scan on xsegment_dim segdim (cost=0.00..1354.85 rows=3643 width=40) > Filter: (segment_srcid IS NULL) > -> Sort (cost=1755323.26..1780227.95 rows=9961874 width=126) > Sort Key: f14.episode_id, f14.customer_id, f14.show_id, f14.season_id > -> Seq Scan on bigtab_stats_fact_tmp14 f14 (cost=0.00..597355.74 > rows=9961874 width=126) > (9 rows) > > > > > > > > > > # \d bigtab_stats_fact_tmp14 > Table "public.bigtab_stats_fact_tmp14" > Column | Type | Modifiers > --------------------------+-----------------------------+----------- > pf_dts_id | bigint | > pf_device_id | bigint | > segment_id | bigint | > cdn_id | bigint | > collector_id | bigint | > digital_envoy_id | bigint | > maxmind_id | bigint | > quova_id | bigint | > website_id | bigint | > referrer_id | bigint | > affiliate_id | bigint | > custom_info_id | bigint | > start_dt | timestamp without time zone | > total_played_time_sec | numeric(18,5) | > bytes_received | bigint | > bytes_transmitted | bigint | > stall_count | integer | > stall_duration_sec | numeric(18,5) | > hiccup_count | integer | > hiccup_duration_sec | numeric(18,5) | > watched_duration_sec | numeric(18,5) | > rewatched_duration_sec | numeric(18,5) | > requested_start_position | numeric(18,5) | > requested_stop_position | numeric(18,5) | > post_position | numeric(18,5) | > is_vod | numeric(1,0) | > sessionid | bigint | > create_dt | timestamp without time zone | > segment_type_id | bigint | > customer_id | bigint | > content_publisher_id | bigint | > content_owner_id | bigint | > episode_id | bigint | > duration_sec | numeric(18,5) | > device_id | bigint | > os_id | bigint | > browser_id | bigint | > cpu_id | bigint | > xsystem_cfg_dim_id | bigint | > xreferrer_dim_id | bigint | > xaffiliate_dim_id | bigint | > xsite_dim_id | bigint | > xpublisher_dim_id | bigint | > season_id | bigint | > show_id | bigint | > xsegment_dim_id | bigint | > location_id | bigint | > zipcode | character varying(20) | > xlocation_dim_id | bigint | > location_srcid | bigint | > timezone | real | > xtime_dim_id | bigint | > Indexes: > "bigtab_stats_fact_tmp14_idx1" btree (customer_id) > "bigtab_stats_fact_tmp14_idx2" btree (show_id) > "bigtab_stats_fact_tmp14_idx3" btree (season_id) > "bigtab_stats_fact_tmp14_idx4" btree (episode_id) > > > > > > > # \d xsegment_dim > Table "public.xsegment_dim" > Column | Type | > Modifiers > ----------------------+-----------------------------+------------------------------------------------------------- > xsegment_dim_id | bigint | not null default > nextval('xsegment_dim_seq'::regclass) > customer_srcid | bigint | not null > show_srcid | bigint | not null > show_name | character varying(500) | not null > season_srcid | bigint | not null > season_name | character varying(500) | not null > episode_srcid | bigint | not null > episode_name | character varying(500) | not null > segment_type_id | integer | > segment_type | character varying(500) | > segment_srcid | bigint | > segment_name | character varying(500) | > effective_dt | timestamp without time zone | not null default now() > inactive_dt | timestamp without time zone | > last_update_dt | timestamp without time zone | not null default now() > Indexes: > "xsegment_dim_pk" PRIMARY KEY, btree (xsegment_dim_id) > "seg1" btree (customer_srcid) > "seg2" btree (show_srcid) > "seg3" btree (season_srcid) > "seg4" btree (episode_srcid) > "seg5" btree (segment_srcid) > "xsegment_dim_ix1" btree (customer_srcid) > > > > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. Shakespeare