Re: DELETE query and indexes again

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I should have included this: when I use the first two cols from the PK (gid, guid) it uses the index. When I use the first col only (gid), it does not. Here's the index delete.

 Delete on play_statistics ps  (cost=0.69..472003.08 rows=28920 width=12) (actual time=3534.409..3534.409 rows=0 loops=1)
   ->  Nested Loop  (cost=0.69..472003.08 rows=28920 width=12) (actual time=16.703..1857.140 rows=2269013 loops=1)
         ->  Seq Scan on stats s  (cost=0.00..2253.24 rows=54624 width=26) (actual time=16.565..60.615 rows=21063 loops=1)
         ->  Index Scan using play_statistics_pkey on play_statistics ps  (cost=0.69..8.59 rows=1 width=26) (actual time=0.008..0.065 rows=108 loops=21063)
               Index Cond: ((gid = s.gid) AND (guid = s.guid))
 Planning Time: 1.494 ms
 JIT:
   Functions: 9
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 2.500 ms, Inlining 0.000 ms, Optimization 0.873 ms, Emission 15.309 ms, Total 18.682 ms
 Execution Time: 3569.225 ms
(11 rows)




On Thu, Jul 23, 2020 at 2:11 PM Wells Oliver <wells.oliver@xxxxxxxxx> wrote:
This play_statistics table has 96,577,179 rows. Its PK is gid integer, guid uuid, cat text, name text, target integer.

Doing a EXPLAIN ANALYZE DELETE WHERE gid IN (SELECT DISTINCT gid FROM temptable) where there is one row in the temptable, I get this:

Delete on play_statistics (cost=1984.14..8037586.36 rows=3734075 width=34) (actual time=407093.859..407093.859 rows=0 loops=1)
-> Hash Join (cost=1984.14..8037586.36 rows=3734075 width=34) (actual time=407093.857..407093.857 rows=0 loops=1)
Hash Cond: (play_statistics.gid = "ANY_subquery".gid)
-> Seq Scan on play_statistics (cost=0.00..7781818.16 rows=96600516 width=10) (actual time=201.521..396052.101 rows=96556116 loops=1)
-> Hash (cost=1981.64..1981.64 rows=200 width=32) (actual time=13.020..13.020 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Subquery Scan on "ANY_subquery" (cost=1977.64..1981.64 rows=200 width=32) (actual time=13.012..13.013 rows=1 loops=1)
-> HashAggregate (cost=1977.64..1979.64 rows=200 width=4) (actual time=13.001..13.002 rows=1 loops=1)
Group Key: stats.gid
-> Seq Scan on stats (cost=0.00..1924.91 rows=21091 width=4) (actual time=0.028..9.244 rows=21091 loops=1)
Planning Time: 0.657 ms
JIT:
Functions: 15
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 1.816 ms, Inlining 53.472 ms, Optimization 89.014 ms, Emission 58.759 ms, Total 203.060 ms
Execution Time: 407112.908 ms
(16 rows)


It takes 7 minutes to delete 21063 rows for one given gid value, which seems excessive given I think it should be using the PK index.

Anything I'm  missing here?

--
Wells Oliver
wells.oliver@xxxxxxxxx


--
Wells Oliver
wells.oliver@xxxxxxxxx

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux