In Postgres 8.4, I have a table called java_types with two columns,
package_name and class_name. There is another table called
java_objects that defines a column called type whose value matches
the concatenation of package_name and class_name. A typical join and
result looks like this: SELECT package_name, class_name, type FROM java_objects o INNER JOIN java_types t ON (t.package_name || '.' || t.class_name) = o.type; package_name, class_name, type "java.lang" , "String" , "java.lang.String" The above works, although it is slow for large data sets so I defined the following index: CREATE INDEX java_type_hash ON java_types USING hash (((package_name::text || '.'::text) || class_name::text)); I confirmed that my new index is being used by inspecting the query plan and finding the following: -> Index Scan using java_type_hash on java_types (cost=0.00..1.22 rows=1 width=49) Index Cond: ((((java_types.package_name)::text || '.'::text) || (java_types.class_name)::text) = (java_objects.type)::text) This gave me a speed boost and worked initially. A day after defining the index, however, I inserted a few rows into java_types and then many rows into java_objects. When I ran the aforementioned query I got 0 results. A month later (without restarting postgres), I ran the same query and still got 0 results. I suspected the index had either been corrupted or not updated automatically. I confirmed this by running the following: REINDEX INDEX java_type_hash; And then when I re-ran the same query the correct results came back! I thought that maybe my system wasn't set to autovacuum but a query for my current settings confirmed that it was: autovacuum = on autovacuum_analyze_scale_factor = 0.1 autovacuum_analyze_threshold = 256 autovacuum_freeze_max_age = 200000000 autovacuum_max_workers = 3 autovacuum_naptime = 60 autovacuum_vacuum_cost_delay = 20 autovacuum_vacuum_cost_limit = -1 autovacuum_vacuum_scale_factor = 0.2 autovacuum_vacuum_threshold = 512 track_counts = on Finally, this query gave me some extra information about when my system had last been vacuumed/analyzed: SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze FROM pg_stat_all_tables WHERE relname = 'java_types' rename , last_vacuum , last_autovacuum , last_analyze , last_autoanalyze "java_types", "2011-10-04 13:37:03.867-07", "" , "2011-10-04 13:21:22.342-07", "2011-08-30 10:15:13.018-07" The index was created on 8/30/2011, inserts were done on java_types and java_objects on 8/31/2011, and then 10/4/2011 is when I manually ran REINDEX (and later VACUUM ANALYZE). So after all that, the questions is: If I have autovacuum set to true then I shouldn't have to worry about calling VACUUM/ANALYZE/REINDEX manually to update my java_types_hash index, right? Maybe my settings are wrong or I'm making an incorrect assumption about Postgres behavior. Any pointer in the right direction helps. Thanks in advance. --
Justin Naifeh Software Developer
www.terraframe.com/products/runwaysdk |