Re: Improving performance of select query

Rob Sargent <robjsargent@xxxxxxxxx> · Mon, 14 Dec 2020 09:36:45 -0700



> On Dec 14, 2020, at 4:47 AM, Thomas Kellerer <shammat@xxxxxxx> wrote:
> 
> Karthik Shivashankar schrieb am 14.12.2020 um 12:38:
>> I have a postgres(v9.5) table named customer holding 1 billion rows.
>> It is not partitioned but it has an index against the primary key
>> (integer). I need to keep a very few records (say, about 10k rows)
>> and remove everything else.
>> 
>> /insert into customer_backup select * from customer where customer_id in (<id1>,<id2>,..); /
>> 
>>  
>> 
>> If I go for something like above I'm afraid the insert-select may take a very long time as when I ran
>> 
>> /select count(*) from customer;/
>> 
>> it is taking about 45 minutes to return the count.
> 
> Well, you need to compare the time with the same condition you use in your
> CREATE TABLE .. AS SELECT statement,
> 
> e.g.:
> 
>   select count(*)
>   from customer
>   where id in (....);
> 
> Or:
> 
>   explain (analyze)
>   select *
>   from customer
>   where id in (....);
> 
> 
> Regards
> Thomas
> 
As for the actually copy of the specific records, I would ‘where exists’ (even possibly with a temp table of ids) rather than in(id1..id10000)

>