Extreme slowdown querying system tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



A Chinese attacker out to steal massive amounts of our curated data attacked our website/database in a way that created a new user account on every access, which includes a couple tables per user. Bottom line: about 3.5 million tables were created. After we tightened our new-user signup procedure, all of the bogus accounts were deleted. Now the entire system (about 400 schemas) contains about 300,000 tables total, which is roughly where it's been for several years.

However, the system tables haven't recovered; see the highlighted time.

db=> \timing
Timing is on.
db=> select count(1) from pg_catalog.pg_attribute;
  count  
---------
 5199278
(1 row)

Time: 535712.996 ms

Obviously not acceptable. Among other things, it means that psql's auto-complete feature is completely broken (it freezes completely; I have to use pg_terminate_backend() to kill the pg_catalog query).

So I did a vacuum-analyze, which helped but is still not acceptable:

db=# vacuum analyze verbose pg_catalog.pg_attribute;
INFO:  vacuuming "pg_catalog.pg_attribute"
INFO:  scanned index "pg_attribute_relid_attnam_index" to remove 4176243 row versions
DETAIL:  CPU 2.53s/2.46u sec elapsed 340.06 sec
INFO:  scanned index "pg_attribute_relid_attnum_index" to remove 4176243 row versions
DETAIL:  CPU 2.61s/2.04u sec elapsed 253.21 sec
INFO:  "pg_attribute": removed 4176243 row versions in 116877 pages
DETAIL:  CPU 2.99s/1.66u sec elapsed 354.83 sec
INFO:  index "pg_attribute_relid_attnam_index" now contains 5210524 row versions in 270863 pages
DETAIL:  4176241 index row versions were removed.
231334 index pages have been deleted, 203555 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.30 sec.
INFO:  index "pg_attribute_relid_attnum_index" now contains 5211181 row versions in 190688 pages
DETAIL:  4176242 index row versions were removed.
160280 index pages have been deleted, 139653 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.17 sec.
INFO:  "pg_attribute": found 1091622 removable, 2922623 nonremovable row versions in 245372 out of 1415386 pages
DETAIL:  1395 dead row versions cannot be removed yet.
There were 6347913 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 12.21s/8.30u sec elapsed 1354.75 sec.
INFO:  analyzing "pg_catalog.pg_attribute"
INFO:  "pg_attribute": scanned 30000 of 1415386 pages, containing 106836 live rows and 49 dead rows; 30000 rows in sample, 21477499 estimated total rows
VACUUM
db=# \timing
Timing is on.
db=# select count(1) from pg_catalog.pg_attribute;
  count  
---------
 5219505
(1 row)

Time: 101105.252 ms

What now?

I'm reluctant to do a VACUUM FULL and REINDEX, for fear that an exclusive lock will halt the system (or that you shouldn't do this to system tables).

Thanks,
Craig


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux