Am 20.02.20 21:41 schrieb(en) Adrian Klaver:
It would be nice to know what:
[snip]
represented in:
Dropping and re-creating the function is actually the last operation in the script. The function is /very/ simple (just a wrapper to hide all internals from "agent" clients): ---8<-------------------------------------------------------------------- DROP FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata jsonb, OUT errortext text, OUT vanished boolean); CREATE FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata jsonb, OUT errortext text, OUT vanished boolean) RETURNS record LANGUAGE plpgsql STABLE SECURITY DEFINER SET search_path TO 'public', 'pg_temp' AS $$ BEGIN SELECT r.data, r.metadata, r.errortext FROM results r INNER JOIN tasks USING(resultid) WHERE taskid = mytaskid LIMIT 1 INTO data, metadata, errortext; SELECT COUNT(*) = 0 FROM tasks WHERE taskid = mytaskid INTO vanished; END; $$; ALTER FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata jsonb, OUT errortext text, OUT vanished boolean) OWNER TO manager; REVOKE ALL ON FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata jsonb, OUT errortext text, OUT vanished boolean) FROM PUBLIC; GRANT ALL ON FUNCTION public.get_result2(mytaskid bigint, OUT data bytea, OUT metadata jsonb, OUT errortext text, OUT vanished boolean) TO "agent"; COMMIT; ---8<--------------------------------------------------------------------
The Postgres logs during and after restart might provide some info. Also the errors thrown when accessing the other function.
I attach the (slightly stripped down; I don't want to post ~100k…) log, starting with the very first error at 13:39:59.302 UTC. Prior to that line are *no* errors. I added a few [comments]. At 13:39:59.484 the error message changes, referring to an ancient function “retrieve_single_result()” which (according to the person who wrote the “agent” client) is *not* called. The clients try periodically poll “get_result2()”. At 13:42:00 the “systemctl restart” has been initiated. At 13:42:02 the database has been stopped and is restarted immediately, revealing one misconfigured client (should be harmless), but the cache lookup error persists. Thanks in advance for your help, Albrecht.
[*no* error prior to this line] 2020-02-20 13:39:59.302 UTC [27971] agent@taskdb ERROR: cache lookup failed for function 1821571 2020-02-20 13:39:59.302 UTC [27971] agent@taskdb CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type 2020-02-20 13:39:59.302 UTC [27971] agent@taskdb STATEMENT: SELECT data, metadata, errortext, vanished FROM get_result2(26671107) 2020-02-20 13:39:59.302 UTC [28048] agent@taskdb ERROR: cache lookup failed for function 1821571 2020-02-20 13:39:59.302 UTC [28048] agent@taskdb CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type 2020-02-20 13:39:59.302 UTC [28048] agent@taskdb STATEMENT: SELECT data, metadata, errortext, vanished FROM get_result2(26670465) 2020-02-20 13:39:59.302 UTC [28021] agent@taskdb ERROR: cache lookup failed for function 1821571 2020-02-20 13:39:59.302 UTC [28021] agent@taskdb CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type 2020-02-20 13:39:59.302 UTC [28021] agent@taskdb STATEMENT: SELECT data, metadata, errortext, vanished FROM get_result2(26670465) [… repeating again and again, with different arguments for the DB function] 2020-02-20 13:39:59.333 UTC [28017] agent@taskdb ERROR: cache lookup failed for function 1821571 2020-02-20 13:39:59.333 UTC [28017] agent@taskdb CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type 2020-02-20 13:39:59.333 UTC [28017] agent@taskdb STATEMENT: SELECT data, metadata, errortext, vanished FROM get_result2(26671517) 2020-02-20 13:39:59.335 UTC [27999] agent@taskdb ERROR: cache lookup failed for function 1821571 2020-02-20 13:39:59.335 UTC [27999] agent@taskdb CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type 2020-02-20 13:39:59.335 UTC [27999] agent@taskdb STATEMENT: SELECT data, metadata, errortext, vanished FROM get_result2(26671570) 2020-02-20 13:39:59.484 UTC [2504] agent@taskdb ERROR: error result "Error: cache lookup failed for function 1821571 CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type " 2020-02-20 13:39:59.484 UTC [2504] agent@taskdb CONTEXT: PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE 2020-02-20 13:39:59.484 UTC [2504] agent@taskdb STATEMENT: SELECT data, metadata FROM retrieve_single_result(26619789) 2020-02-20 13:39:59.504 UTC [2504] agent@taskdb ERROR: error result "Error: cache lookup failed for function 1821571 CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type " 2020-02-20 13:39:59.504 UTC [2504] agent@taskdb CONTEXT: PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE 2020-02-20 13:39:59.504 UTC [2504] agent@taskdb STATEMENT: SELECT data, metadata FROM retrieve_single_result(26618953) 2020-02-20 13:39:59.514 UTC [2504] agent@taskdb ERROR: error result "Error: cache lookup failed for function 1821571 CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type " 2020-02-20 13:39:59.514 UTC [2504] agent@taskdb CONTEXT: PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE 2020-02-20 13:39:59.514 UTC [2504] agent@taskdb STATEMENT: SELECT data, metadata FROM retrieve_single_result(26483130) […repeating again and again, with different arguments for the DB function] 2020-02-20 13:41:45.469 UTC [2504] agent@taskdb ERROR: error result "Error: cache lookup failed for function 1821571 CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type " 2020-02-20 13:41:45.469 UTC [2504] agent@taskdb CONTEXT: PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE 2020-02-20 13:41:45.469 UTC [2504] agent@taskdb STATEMENT: SELECT data, metadata FROM retrieve_single_result(26619490) 2020-02-20 13:42:00.000 UTC [2036] LOG: cron job 1 starting: SELECT FROM task_watchdog() 2020-02-20 13:42:00.010 UTC [2036] LOG: cron job 1 completed: 1 row 2020-02-20 13:42:00.404 UTC [2027] LOG: received fast shutdown request 2020-02-20 13:42:00.404 UTC [2027] LOG: aborting any active transactions 2020-02-20 13:42:00.405 UTC [28051] agent@taskdb FATAL: terminating connection due to administrator command 2020-02-20 13:42:00.405 UTC [28055] agent@taskdb FATAL: terminating connection due to administrator command […many more clients being disconnected…] 2020-02-20 13:42:00.407 UTC [28036] agent@taskdb FATAL: terminating connection due to administrator command 2020-02-20 13:42:00.407 UTC [7065] agent@taskdb FATAL: terminating connection due to administrator command 2020-02-20 13:42:00.407 UTC [2036] LOG: pg_cron scheduler shutting down 2020-02-20 13:42:00.407 UTC [28042] agent@taskdb FATAL: terminating connection due to administrator command 2020-02-20 13:42:00.408 UTC [27987] agent@taskdb FATAL: terminating connection due to administrator command […still disconnecting clients…] 2020-02-20 13:42:00.433 UTC [20299] agent@taskdb FATAL: terminating connection due to administrator command 2020-02-20 13:42:00.434 UTC [20296] agent@taskdb FATAL: terminating connection due to administrator command […] 2020-02-20 13:42:00.713 UTC [1895] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:00.717 UTC [1896] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:00.725 UTC [1897] agent@taskdb FATAL: the database system is shutting down […and many more of these…] 2020-02-20 13:42:00.729 UTC [1923] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:00.738 UTC [1928] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:00.742 UTC [2027] LOG: background worker "logical replication launcher" (PID 2037) exited with exit code 1 2020-02-20 13:42:00.745 UTC [1902] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:00.747 UTC [1900] agent@taskdb FATAL: the database system is shutting down […] 2020-02-20 13:42:01.489 UTC [1958] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:01.491 UTC [1956] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:01.494 UTC [2031] LOG: shutting down 2020-02-20 13:42:01.497 UTC [1957] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:01.498 UTC [1955] agent@taskdb FATAL: the database system is shutting down […] 2020-02-20 13:42:01.766 UTC [1988] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:01.940 UTC [1989] agent@taskdb FATAL: the database system is shutting down 2020-02-20 13:42:01.986 UTC [2027] LOG: database system is shut down [as I initiated s 'systemctl restart', the system is started immediately] 2020-02-20 13:42:02.214 UTC [1995] LOG: listening on IPv4 address "0.0.0.0", port 5432 2020-02-20 13:42:02.214 UTC [1995] LOG: listening on IPv6 address "::", port 5432 2020-02-20 13:42:02.215 UTC [1995] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-02-20 13:42:02.243 UTC [1996] LOG: database system was shut down at 2020-02-20 13:42:01 UTC 2020-02-20 13:42:02.265 UTC [1995] LOG: database system is ready to accept connections 2020-02-20 13:42:02.284 UTC [1997] agent@taskdb FATAL: the database system is starting up 2020-02-20 13:42:02.287 UTC [2005] agent@taskdb FATAL: no pg_hba.conf entry for host "172.16.98.8", user "agent", database "taskdb", SSL off [misconfigured client, the message is ok] 2020-02-20 13:42:02.292 UTC [2003] LOG: pg_cron scheduler started 2020-02-20 13:42:02.782 UTC [2012] [unknown]@[unknown] LOG: incomplete startup packet 2020-02-20 13:42:20.350 UTC [2133] agent@taskdb ERROR: error result "Error: cache lookup failed for function 1821571 CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type " 2020-02-20 13:42:20.350 UTC [2133] agent@taskdb CONTEXT: PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE 2020-02-20 13:42:20.350 UTC [2133] agent@taskdb STATEMENT: SELECT data, metadata FROM retrieve_single_result(26619789) 2020-02-20 13:42:20.374 UTC [2133] agent@taskdb ERROR: error result "Error: cache lookup failed for function 1821571 CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type " 2020-02-20 13:42:20.374 UTC [2133] agent@taskdb CONTEXT: PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE 2020-02-20 13:42:20.374 UTC [2133] agent@taskdb STATEMENT: SELECT data, metadata FROM retrieve_single_result(26618953) 2020-02-20 13:42:20.388 UTC [2133] agent@taskdb ERROR: error result "Error: cache lookup failed for function 1821571 CONTEXT: PL/pgSQL function get_result2(bigint) while casting return value to function's return type " 2020-02-20 13:42:20.388 UTC [2133] agent@taskdb CONTEXT: PL/pgSQL function retrieve_single_result(bigint) line 7 at RAISE 2020-02-20 13:42:20.388 UTC [2133] agent@taskdb STATEMENT: SELECT data, metadata FROM retrieve_single_result(26483130) […continues with the same error again and again]
Attachment:
pgpkuQqYceLXr.pgp
Description: PGP signature