On Sat, 01 Sep 2007 23:00:10 -0400 Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > Shawn <postgres@xxxxxxxxxxxxxxx> writes: > > The query just ran and here is the basic output: > > > UPDATE 15445 > > Time: 22121.141 ms > > > and > > > explain ANALYZE update shawns_data set alias = null; > > QUERY > > PLAN > > ----------------------------------------------------------------------------------------------------------------- > > Seq Scan on shawns_data (cost=0.00..465.45 rows=15445 width=480) > > (actual time=0.034..67.743 rows=15445 loops=1) Total runtime: > > 1865.002 ms (2 rows) > > Hmmm ... did you run the real query and the EXPLAIN in immediate > succession? If so, the only reason I can think of for the speed > difference is that all the rows were fetched already for the second > run. Which doesn't make a lot of sense given the hardware specs > you mentioned. Try watching "vmstat 1" and see if there's some > noticeable difference in the behavior. > Actually no, The runs were at least an 24 hours apart. Running the query standalone was quite interesting. There's a lot of disk output: procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 2580 450992 84420 16007864 0 0 1 1 1 2 3 0 94 2 0 0 2580 450992 84420 16007864 0 0 0 0 1003 18 0 0 100 0 0 0 2580 450660 84420 16007864 0 0 0 6764 1127 202 0 0 100 0 0 0 2580 450660 84420 16007864 0 0 0 0 1004 28 0 0 100 0 0 0 2580 450660 84420 16007864 0 0 0 0 1011 58 0 0 100 0 0 0 2580 450660 84420 16007864 0 0 0 0 1011 50 0 0 100 0 0 0 2580 450660 84428 16007864 0 0 0 32 1011 53 0 0 100 0 0 0 2580 450660 84428 16007864 0 0 0 3034 1095 191 0 0 100 0 0 0 2580 450164 84428 16007864 0 0 0 0 1032 362 0 0 100 0 0 0 2580 450164 84428 16007864 0 0 0 0 1002 22 0 0 100 0 1 0 2580 446444 84428 16012240 0 0 0 2568 1024 398 4 1 95 0 1 0 2580 456172 84428 16001760 0 0 8 11549 1136 616 10 3 87 0 0 0 2580 452164 84428 16005524 0 0 4 26580 1885 1479 4 2 93 1 0 0 2580 452164 84436 16005516 0 0 0 40 1004 26 0 0 100 0 0 0 2580 452164 84436 16005524 0 0 0 0 1001 36 0 0 100 0 0 0 2580 452164 84436 16005524 0 0 0 0 1001 24 0 0 100 0 0 0 2580 452164 84436 16005524 0 0 0 0 1001 36 0 0 100 0 0 0 2580 452164 84436 16005524 0 0 0 3 1005 27 0 0 100 0 0 0 2580 452040 84436 16005524 0 0 0 0 1006 41 0 0 100 0 0 0 2580 452040 84436 16005524 0 0 0 0 1002 20 0 0 100 0 Also it runs a lot faster by itself, so I am thinking there is some query early on that is really fragmenting the table. Here's the whole script: \timing select now(); create temporary table unit_operation(lid varchar(5), hang_time varchar(3),multi_csd_enabled bool,multi_node_enabled bool,wide_area_enabled bool); create temporary table unit(lid varchar(5),alias char(8),type smallint); \copy unit from '/tmp/wc_unit' \copy unit_operation from '/tmp/wc_unit_operation' truncate csd; insert into csd(lid,alias,type) select lid,alias,type from unit; update csd set hang_time = uo.hang_time, multi_csd_enabled = uo.multi_csd_enabled, multi_node_enabled = uo.multi_node_enabled, wide_area_enabled = uo.wide_area_enabled from unit_operation uo where csd.lid = uo.lid; create temporary table groups(lid varchar(5),alias varchar(8)); create temporary table group_operation(lid varchar(15),hang_time varchar(8)); \copy groups from '/tmp/wc_groups' \copy group_operation from '/tmp/wc_group_operation' truncate gid; insert into gid(gid,alias) select lid,alias from groups; update gid set hang_time = go.hang_time from group_operation go where gid.gid = go.lid; truncate rc_unit_state; \copy rc_unit_state from '/tmp/wc_rc_unit_state' with null as '' update lidrange set csd = false, data = false, disable = false, lost = false; update lidrange set csd = true from csd where lidrange.lid = csd.lid; update lidrange set data = true from shawns_data where lidrange.lid = shawns_data.lid; update lidrange set disable = true from rc_unit_state where lidrange.lid = rc_unit_state.lid; update shawns_data set alias = null where alias is not null; update shawns_data set alias = csd.alias from csd where shawns_data.lid = csd.lid; update lidrange set lost = true from shawns_data where shawns_data.lid = lidrange.lid and shawns_data.lost = true \i '/home/shawn/vacuum_data.sql' As you can see there is nothing really accessing the shawns_data table before the query in question. I have also allocated 380 Megs of RAM to any virtual process, I can't see where any of these tables could take up this much RAM. Here is my postgresql.conf: # ----------------------------- # PostgreSQL configuration file # ----------------------------- # # This file consists of lines of the form: # # name = value # # (The '=' is optional.) White space may be used. Comments are introduced # with '#' anywhere on a line. The complete list of option names and # allowed values can be found in the PostgreSQL documentation. The # commented-out settings shown in this file represent the default values. # # Please note that re-commenting a setting is NOT sufficient to revert it # to the default value, unless you restart the server. # # Any option can also be given as a command line switch to the server, # e.g., 'postgres -c log_connections=on'. Some options can be changed at # run-time with the 'SET' SQL command. # # This file is read on server startup and when the server receives a # SIGHUP. If you edit the file on a running system, you have to SIGHUP the # server for the changes to take effect, or use "pg_ctl reload". Some # settings, which are marked below, require a server shutdown and restart # to take effect. # # Memory units: kB = kilobytes MB = megabytes GB = gigabytes # Time units: ms = milliseconds s = seconds min = minutes h = hours d = days #--------------------------------------------------------------------------- # FILE LOCATIONS #--------------------------------------------------------------------------- # The default values of these variables are driven from the -D command line # switch or PGDATA environment variable, represented here as ConfigDir. #data_directory = 'ConfigDir' # use data in another directory # (change requires restart) #hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file # (change requires restart) #ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file # (change requires restart) # If external_pid_file is not explicitly set, no extra PID file is written. #external_pid_file = '(none)' # write an extra PID file # (change requires restart) #--------------------------------------------------------------------------- # CONNECTIONS AND AUTHENTICATION #--------------------------------------------------------------------------- # - Connection Settings - #listen_addresses = 'localhost' # what IP address(es) to listen on; # comma-separated list of addresses; # defaults to 'localhost', '*' = all # (change requires restart) #port = 5432 # (change requires restart) max_connections = 20 # (change requires restart) # Note: increasing max_connections costs ~400 bytes of shared memory per # connection slot, plus lock space (see max_locks_per_transaction). You # might also need to raise shared_buffers to support more connections. #superuser_reserved_connections = 3 # (change requires restart) #unix_socket_directory = '' # (change requires restart) #unix_socket_group = '' # (change requires restart) #unix_socket_permissions = 0777 # octal # (change requires restart) #bonjour_name = '' # defaults to the computer name # (change requires restart) # - Security & Authentication - #authentication_timeout = 1min # 1s-600s #ssl = off # (change requires restart) #password_encryption = on #db_user_namespace = off # Kerberos #krb_server_keyfile = '' # (change requires restart) #krb_srvname = 'postgres' # (change requires restart) #krb_server_hostname = '' # empty string matches any keytab entry # (change requires restart) #krb_caseins_users = off # (change requires restart) # - TCP Keepalives - # see 'man 7 tcp' for details #tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds; # 0 selects the system default #tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds; # 0 selects the system default #tcp_keepalives_count = 0 # TCP_KEEPCNT; # 0 selects the system default #--------------------------------------------------------------------------- # RESOURCE USAGE (except WAL) #--------------------------------------------------------------------------- # - Memory - shared_buffers = 300MB # min 128kB or max_connections*16kB # (change requires restart) #temp_buffers = 8MB # min 800kB #max_prepared_transactions = 5 # can be 0 or more # (change requires restart) # Note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 300MB # min 64kB maintenance_work_mem = 260MB # min 1MB max_stack_depth = 7MB # min 100kB # - Free Space Map - max_fsm_pages = 15360000 # min max_fsm_relations*16, 6 bytes each # (change requires restart) #max_fsm_relations = 1000 # min 100, ~70 bytes each # (change requires restart) # - Kernel Resource Usage - #max_files_per_process = 1000 # min 25 # (change requires restart) #shared_preload_libraries = '' # (change requires restart) # - Cost-Based Vacuum Delay - #vacuum_cost_delay = 0 # 0-1000 milliseconds #vacuum_cost_page_hit = 1 # 0-10000 credits #vacuum_cost_page_miss = 10 # 0-10000 credits #vacuum_cost_page_dirty = 20 # 0-10000 credits #vacuum_cost_limit = 200 # 0-10000 credits # - Background writer - #bgwriter_delay = 200ms # 10-10000ms between rounds #bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers scanned/round #bgwriter_lru_maxpages = 5 # 0-1000 buffers max written/round #bgwriter_all_percent = 0.333 # 0-100% of all buffers scanned/round #bgwriter_all_maxpages = 5 # 0-1000 buffers max written/round #--------------------------------------------------------------------------- # WRITE AHEAD LOG #--------------------------------------------------------------------------- # - Settings - #fsync = on # turns forced synchronization on or off #wal_sync_method = fsync # the default is the first option # supported by the operating system: # open_datasync # fdatasync # fsync # fsync_writethrough # open_sync #full_page_writes = on # recover from partial page writes #wal_buffers = 64kB # min 32kB # (change requires restart) #commit_delay = 0 # range 0-100000, in microseconds #commit_siblings = 5 # range 1-1000 # - Checkpoints - checkpoint_segments = 50 # in logfile segments, min 1, 16MB each #checkpoint_timeout = 5min # range 30s-1h #checkpoint_warning = 30s # 0 is off # - Archiving - #archive_command = '' # command to use to archive a logfile segment #archive_timeout = 0 # force a logfile segment switch after this # many seconds; 0 is off #--------------------------------------------------------------------------- # QUERY TUNING #--------------------------------------------------------------------------- # - Planner Method Configuration - #enable_bitmapscan = on #enable_hashagg = on #enable_hashjoin = on #enable_indexscan = on #enable_mergejoin = on #enable_nestloop = on #enable_seqscan = on #enable_sort = on #enable_tidscan = on # - Planner Cost Constants - #seq_page_cost = 1.0 # measured on an arbitrary scale #random_page_cost = 4.0 # same scale as above #cpu_tuple_cost = 0.01 # same scale as above #cpu_index_tuple_cost = 0.005 # same scale as above #cpu_operator_cost = 0.0025 # same scale as above #effective_cache_size = 128MB # - Genetic Query Optimizer - #geqo = on #geqo_threshold = 12 #geqo_effort = 5 # range 1-10 #geqo_pool_size = 0 # selects default based on effort #geqo_generations = 0 # selects default based on effort #geqo_selection_bias = 2.0 # range 1.5-2.0 # - Other Planner Options - #default_statistics_target = 10 # range 1-1000 #constraint_exclusion = off #from_collapse_limit = 8 #join_collapse_limit = 8 # 1 disables collapsing of explicit # JOINs #--------------------------------------------------------------------------- # ERROR REPORTING AND LOGGING #--------------------------------------------------------------------------- # - Where to Log - #log_destination = 'stderr' # Valid values are combinations of # stderr, syslog and eventlog, # depending on platform. # This is used when logging to stderr: #redirect_stderr = off # Enable capturing of stderr into log # files # (change requires restart) # These are only used if redirect_stderr is on: #log_directory = 'pg_log' # Directory where log files are written # Can be absolute or relative to PGDATA log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern. # Can include strftime() escapes log_truncate_on_rotation = off # If on, any existing log file of the same # name as the new log file will be # truncated rather than appended to. But # such truncation only occurs on # time-driven rotation, not on restarts # or size-driven rotation. Default is # off, meaning append to existing files # in all cases. #log_rotation_age = 1m # Automatic rotation of logfiles will # happen after that time. 0 to # disable. log_rotation_size = 10MB # Automatic rotation of logfiles will # happen after that much log # output. 0 to disable. # These are relevant when logging to syslog: #syslog_facility = 'LOCAL0' #syslog_ident = 'postgres' # - When to Log - #client_min_messages = notice # Values, in order of decreasing detail: # debug5 # debug4 # debug3 # debug2 # debug1 # log # notice # warning # error #log_min_messages = notice # Values, in order of decreasing detail: # debug5 # debug4 # debug3 # debug2 # debug1 # info # notice # warning # error # log # fatal # panic #log_error_verbosity = default # terse, default, or verbose messages #log_min_error_statement = error # Values in order of increasing severity: # debug5 # debug4 # debug3 # debug2 # debug1 # info # notice # warning # error # fatal # panic (effectively off) #log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements # and their durations. #silent_mode = off # DO NOT USE without syslog or # redirect_stderr # (change requires restart) # - What to Log - #debug_print_parse = off #debug_print_rewritten = off #debug_print_plan = off #debug_pretty_print = off #log_connections = off #log_disconnections = off #log_duration = off #log_line_prefix = '' # Special values: # %u = user name # %d = database name # %r = remote host and port # %h = remote host # %p = PID # %t = timestamp (no milliseconds) # %m = timestamp with milliseconds # %i = command tag # %c = session id # %l = session line number # %s = session start timestamp # %x = transaction id # %q = stop here in non-session # processes # %% = '%' # e.g. '<%u%%%d> ' #log_statement = 'none' # none, ddl, mod, all #log_hostname = off #--------------------------------------------------------------------------- # RUNTIME STATISTICS #--------------------------------------------------------------------------- # - Query/Index Statistics Collector - #stats_command_string = on #update_process_title = on #stats_start_collector = on # needed for block or row stats # (change requires restart) #stats_block_level = off #stats_row_level = off #stats_reset_on_server_start = off # (change requires restart) # - Statistics Monitoring - #log_parser_stats = off #log_planner_stats = off #log_executor_stats = off #log_statement_stats = off #--------------------------------------------------------------------------- # AUTOVACUUM PARAMETERS #--------------------------------------------------------------------------- #autovacuum = off # enable autovacuum subprocess? # 'on' requires stats_start_collector # and stats_row_level to also be on #autovacuum_naptime = 1min # time between autovacuum runs #autovacuum_vacuum_threshold = 500 # min # of tuple updates before # vacuum #autovacuum_analyze_threshold = 250 # min # of tuple updates before # analyze #autovacuum_vacuum_scale_factor = 0.2 # fraction of rel size before # vacuum #autovacuum_analyze_scale_factor = 0.1 # fraction of rel size before # analyze #autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum # (change requires restart) #autovacuum_vacuum_cost_delay = -1 # default vacuum cost delay for # autovacuum, -1 means use # vacuum_cost_delay #autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for # autovacuum, -1 means use # vacuum_cost_limit #--------------------------------------------------------------------------- # CLIENT CONNECTION DEFAULTS #--------------------------------------------------------------------------- # - Statement Behavior - #search_path = '"$user",public' # schema names #default_tablespace = '' # a tablespace name, '' uses # the default #check_function_bodies = on #default_transaction_isolation = 'read committed' #default_transaction_read_only = off #statement_timeout = 0 # 0 is disabled #vacuum_freeze_min_age = 100000000 # - Locale and Formatting - datestyle = 'iso, mdy' #timezone = unknown # actually, defaults to TZ # environment setting #timezone_abbreviations = 'Default' # select the set of available timezone # abbreviations. Currently, there are # Default # Australia # India # However you can also create your own # file in share/timezonesets/. #extra_float_digits = 0 # min -15, max 2 #client_encoding = sql_ascii # actually, defaults to database # encoding # These settings are initialized by initdb -- they might be changed lc_messages = 'en_US' # locale for system error message # strings lc_monetary = 'en_US' # locale for monetary formatting lc_numeric = 'en_US' # locale for number formatting lc_time = 'en_US' # locale for time formatting # - Other Defaults - #explain_pretty_print = on #dynamic_library_path = '$libdir' #local_preload_libraries = '' #--------------------------------------------------------------------------- # LOCK MANAGEMENT #--------------------------------------------------------------------------- #deadlock_timeout = 1s #max_locks_per_transaction = 64 # min 10 # (change requires restart) # Note: each lock table slot uses ~270 bytes of shared memory, and there are # max_locks_per_transaction * (max_connections + max_prepared_transactions) # lock table slots. #--------------------------------------------------------------------------- # VERSION/PLATFORM COMPATIBILITY #--------------------------------------------------------------------------- # - Previous Postgres Versions - #add_missing_from = off #array_nulls = on #backslash_quote = safe_encoding # on, off, or safe_encoding #default_with_oids = off #escape_string_warning = on #standard_conforming_strings = off #regex_flavor = advanced # advanced, extended, or basic #sql_inheritance = on # - Other Platforms & Clients - #transform_null_equals = off #--------------------------------------------------------------------------- # CUSTOMIZED OPTIONS #--------------------------------------------------------------------------- #custom_variable_classes = '' # list of custom variable class names Any suggestions to improve things? Shawn ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster