Re: Index creation running now for 14 hours

Tory M Blue <tmblue@xxxxxxxxx> · Wed, 26 Aug 2015 13:26:30 -0700

On Wed, Aug 26, 2015 at 12:36 PM, Igor Neyman <ineyman@xxxxxxxxxxxxxx> wrote:

From: Tory M Blue [mailto:tmblue@xxxxxxxxx]

Sent: Wednesday, August 26, 2015 3:26 PM

To: Igor Neyman <ineyman@xxxxxxxxxxxxxx>

Cc: pgsql-performance <pgsql-performance@xxxxxxxxxxxxxx>

Subject: Re:  Index creation running now for 14 hours

On Wed, Aug 26, 2015 at 12:18 PM, Igor Neyman <ineyman@xxxxxxxxxxxxxx> wrote:

From:
pgsql-performance-owner@xxxxxxxxxxxxxx [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx]
On Behalf Of Tory M Blue

Sent: Wednesday, August 26, 2015 3:14 PM

To: pgsql-performance <pgsql-performance@xxxxxxxxxxxxxx>

Subject:  Index creation running now for 14 hours

I'm running 9.3.4 with slon 2.2.3, I did a drop add last night at 9pm, it started this particular tables index creation at 10:16pm and it's still running. 1 single core is at 100%
 (32 core box) and there is almost zero I/O activity.

CentOS 6.6

 16398 | clsdb | 25765 |       10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 |                 |       45712 | 2015-08-25 21:12:01.6
19819-07 | 2015-08-25 21:22:08.68766-07  | 2015-08-25 22:16:03.10099-07  | 2015-08-25 22:16:03.100992-07 | f       | active | select "_cls".fini
shTableAfterCopy(143); analyze "torque"."impressions"; 
I was wondering if there were underlying tools to see how it's progressing, or if there is anything I can do to bump the performance mid creation? Nothing I can do really without
 stopping postgres or slon, but that would start me back at square one.

Thanks
Tory

i

Check pg_locks in regards to the table in question.

Regards,
Igor Neyman

thanks Igor I did, but not clear what that is telling me, there are 249 rows in there, nothing has a table name , they are all for the PID in the "analyze torque.impressions line that I listed above pid 25765.

Here is one for an exclusive lock, but what should I be looking for? There are no other processes on this box other than slon and this index creation.

 transactionid |          |            |      |       |            |      93588453 |         |       |          | 4/25823460         | 25765 | ExclusiveL

ock       | t       | f

Thanks

Tory 

There are objects OIDs in pg_lock, not names.
Find the OID of the table that you create your index for, and search pg_locks for the records referencing your table.
It cannot be that all records in pg_locks are for pid running “analyze”, there should be records with pid running your “create index”.
What’s the size of the table you are indexing?
Also, take a look at pg_stat_activity for long running transactions/queries.

Igor Neyman

the table is 90GB without indexes,  285GB with indexes and bloat,  The row count is not actually completing.. 125Million rows over 13 months, this table is probably close to 600million rows.

Yes I have long running queries, my job started last night at 9pm, it appears 3 of the 6 indexes on this table are completed, but I'm about to blow out my disk space, so I won't be able to give it much longer to run.... Bah!

16398 | clsdb | 25765 |       10 | postgres | slon.remoteWorkerThread_1 | 10.13.200.232 |                 |       45712 | 2015-08-25 21:12:01.619819-07 | 2015-08-25 21:22:08.68766-07  | 2015-08-25 22:16:03.10099-07  | 2015-08-25 22:16:03.100992-07 | f       | active | select "_cls".finishTableAfterCopy(143); analyze "torque"."impressions"; 

 16398 | lsdb | 25777 |       10 | postgres | slon.local_cleanup        | 10.13.200.232 |                 |       45718 | 2015-08-25 21:12:01.624032-07 | 2015-08-25 21:23:09.103395-07 | 2015-08-25 21:23:09.103395-07 | 2015-08-25 21:23:09.103397-07 | t       | active | begin;lock table "_cls".sl_config_lock;select "_cls".cleanupEvent('10 minutes'::interval);commit; 
there is nothing else in the pg_stat table other than a bunch of slony connections, these are the only 2 items that have been running since the index started last night at 10:16pm

2015-08-25 22:16:03 PDT CONFIG remoteWorkerThread_1: 67254824703 bytes copied for table "torque"."impressions"
The above is when it had finished copying the table and started on the index..
Well as I said I'm running out of storage as the index is creating some serious data on the filesystem, I'll have to kill it, try to massage the data a bit and increase the maintenance_work mem to use some of my 256GB of ram to try to get through this. Right now the 100% cpu process which is this index is only using 3.5GB and has been for the last 15 hours

Tory