Hello, I've got a table custom_data which essentially contains a number of key/value pairs. This table holds a large number of records (about 40M) and I need the distinct keys and values for several purposes. Selecting those distinct values takes a couple of seconds, so I decided to maintain a separate lookup table for both the key and the value data. The lookup tables are maintained by a trigger that reacts to inserts/updates/deletes on the original table.

While checking the correctness of my trigger function I noticed that the SQL query inside the trigger function is surprisingly slow, taking about 5-6 seconds. When I ran the same query outside the trigger function it showed the expected performance and returned in a couple of milliseconds.
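For illustration, each lookup table is essentially a one-column table; a minimal sketch (the name custom_data_keys is made up here, and the value lookup table is analogous):

    CREATE TABLE public.custom_data_keys (
        key TEXT NOT NULL,   -- one row per distinct key found in custom_data
        CONSTRAINT pkey_custom_data_keys PRIMARY KEY (key)
    );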
Though the original table is very large, it holds only a small number of distinct key/value values:

    SELECT DISTINCT key FROM custom_data;      -- 12 rows returned
    SELECT DISTINCT value FROM custom_data;    -- 13 rows returned

Here is the relevant information (the function body of the trigger function is reduced to show the behaviour):

PostgreSQL version: PostgreSQL 9.1.13, compiled by Visual C++ build 1500, 64-bit
OS version: Windows 7 64bit
Client: EMS SQL Manager (SQL Editor used to run SQL commands from an editor)

Server configuration:

    name                       | current_setting     | source
    ---------------------------+---------------------+---------------------
    DateStyle                  | ISO, DMY            | session
    default_text_search_config | pg_catalog.german   | configuration file
    effective_cache_size       | 8GB                 | configuration file
    lc_messages                | German_Germany.1252 | configuration file
    lc_monetary                | German_Germany.1252 | configuration file
    lc_numeric                 | German_Germany.1252 | configuration file
    lc_time                    | German_Germany.1252 | configuration file
    listen_addresses           | *                   | configuration file
    log_destination            | stderr              | configuration file
    log_line_prefix            | %t                  | configuration file
    log_timezone               | CET                 | environment variable
    logging_collector          | on                  | configuration file
    max_connections            | 100                 | configuration file
    max_stack_depth            | 2MB                 | environment variable
    port                       | 5432                | configuration file
    shared_buffers             | 4GB                 | configuration file
    statement_timeout          | 0                   | session
    TimeZone                   | CET                 | environment variable
    work_mem                   | 64MB                | configuration file

custom_data table definition:

    CREATE TABLE public.custom_data (
        custom_data_id SERIAL,
        file_id INTEGER DEFAULT 0 NOT NULL,
        user_id INTEGER DEFAULT 0 NOT NULL,
        "timestamp" TIMESTAMP(0) WITHOUT TIME ZONE DEFAULT '1970-01-01 00:00:00'::timestamp without time zone NOT NULL,
        key TEXT DEFAULT ''::text NOT NULL,
        value TEXT DEFAULT ''::text NOT NULL,
        CONSTRAINT pkey_custom_data PRIMARY KEY (custom_data_id)
    ) WITHOUT OIDS;

    CREATE INDEX idx_custom_data_key ON public.custom_data USING btree (key);
    CREATE INDEX idx_custom_data_value ON public.custom_data USING btree (value);

    CREATE TRIGGER on_custom_data_changed
        AFTER INSERT OR UPDATE OR DELETE ON public.custom_data
        FOR EACH ROW
        EXECUTE PROCEDURE public.on_change_custom_data();
Trigger function (body reduced to the part that shows the behaviour; the actual lookup table maintenance inside the IF block is omitted):

    CREATE OR REPLACE FUNCTION public.on_change_custom_data()
    RETURNS trigger AS
    $body$
    BEGIN
        IF TG_OP IN ('INSERT', 'UPDATE') THEN
            RAISE NOTICE 'Check custom data key start : %', timeofday();
            IF NOT EXISTS (SELECT 1 FROM custom_data WHERE key = NEW.key) THEN
                NULL;  -- maintenance of the key lookup table omitted here
            END IF;
            RAISE NOTICE 'Check custom data key end : %', timeofday();
        END IF;
        RETURN NULL;  -- return value of an AFTER trigger is ignored
    END;
    $body$
    LANGUAGE plpgsql;

PostgreSQL log (HINWEIS is the German-localised NOTICE):

    HINWEIS:  Check custom data key start : Fri Oct 30 11:56:41.785000 2015 CET   << start of IF NOT EXISTS (...)
    HINWEIS:  Check custom data key end : Fri Oct 30 11:56:47.145000 2015 CET     << end of IF NOT EXISTS (...): ~5.4 seconds

Same query run in the SQL editor:

    SELECT 1 FROM custom_data WHERE key = 'key-1';    -- returns in a couple of milliseconds

As you can see, there's a huge runtime difference between the SELECT query executed inside the trigger function and the same query run from the SQL editor.
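To compare the plan chosen inside the trigger with the plan the SQL editor gets, one option (a diagnostic sketch, assuming the auto_explain contrib module is installed, as it normally is with 9.1) is to log the plans of nested statements:

    -- load auto_explain for the current session (requires superuser);
    -- alternatively it can be preloaded via shared_preload_libraries
    LOAD 'auto_explain';

    -- log the plan of every statement, including the SELECT executed
    -- inside the PL/pgSQL trigger function
    SET auto_explain.log_min_duration = 0;
    SET auto_explain.log_analyze = on;
    SET auto_explain.log_nested_statements = on;

    -- a single-row insert now fires the trigger, and the plan of the
    -- inner SELECT shows up in the PostgreSQL log
    INSERT INTO custom_data (key, value) VALUES ('key-1', 'value-1');

This should show whether the query inside the trigger uses idx_custom_data_key or falls back to a sequential scan.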