Re: Is my database now too big?

Scott Marlowe wrote:
On 10/7/07, Darren Reed <darrenr@xxxxxxxxxxxx> wrote:
> Scott Marlowe wrote:
> > ...
> >
> > Any reasonably modern version of pgsql should simply stop accepting
> > requests rather than suffering loss due to txid wraparound. So, I can
> > think of two possibilities here. Bad hardware or operator error.
> >
> > Assuming you've checked out your machine thoroughly for bad hardware,
> > I can see a scenario where one does something like:
> >
> > begin;
> > create table xyz;
> > load 10,000,000 rows
> > manipulate rows
> > shutdown db without committing
> > start database
> > voila, table xyz is gone, and rightly so.
> >
> > Got more detailed info on what you're doing?
>
> That does describe what was happening (I haven't used BEGIN/COMMIT.)

Then it isn't the same thing.  If you did a begin, then did everything
else without a commit, the table would rightly disappear.
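
The effect described above can be reproduced in miniature. The sketch below is only an illustration: it uses Python's built-in sqlite3 module instead of PostgreSQL (an assumption made so it runs without a server; SQLite also treats CREATE TABLE as transactional), and the function name and file name are made up for the example. It creates table xyz inside an explicit transaction, closes the connection with or without a COMMIT, and then reconnects to see whether the table is still there.

```python
import os
import sqlite3
import tempfile


def table_survives(commit: bool) -> bool:
    """Create table xyz inside BEGIN, optionally COMMIT, then reconnect
    and report whether the table still exists.

    SQLite stands in for PostgreSQL here; this is an assumption of the
    sketch, not what the thread's server does internally.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # hypothetical file name

        # isolation_level=None: the module issues no implicit BEGIN/COMMIT,
        # so the transaction boundaries below are exactly the ones written.
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("BEGIN")
        conn.execute("CREATE TABLE xyz (id INTEGER)")
        conn.executemany("INSERT INTO xyz VALUES (?)",
                         ((i,) for i in range(1000)))
        if commit:
            conn.execute("COMMIT")
        conn.close()  # an open transaction is rolled back on close

        # "Restart": open the same file again and look for the table.
        conn = sqlite3.connect(path)
        found = conn.execute(
            "SELECT count(*) FROM sqlite_master "
            "WHERE type = 'table' AND name = 'xyz'"
        ).fetchone()[0]
        conn.close()
        return found == 1


print("without COMMIT:", table_survives(commit=False))
print("with COMMIT:   ", table_survives(commit=True))
```

Without the COMMIT the table and its rows are gone after reconnecting; with it they persist.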

Right, I'm with you on that.
A few days ago I did:
pg_dumpall > foo
What I was doing yesterday was:
rm -rf /data/db/*
initdb -D /data/db
start
psql < foo
run for some period
stop
reboot
start
...tables have gone but disk space is still in use.
I don't know if it was during the period of running that the
database got corrupted (interrupted insert/update/query?)
or what happened.


> Nothing very special, I thought...
>
> But, doing "SELECT * FROM ifl LIMIT 1;" causes postgres to grow its
> process to 2GB and then die because the OS ran out of swap!

I doubt that exact query is causing the db to run out of memory,
unless ifl is a complex view or something.

Can you be more specific on what exact query causes the problem to show up?

It turned out that _any_ query on that table caused the problem to show up.

I couldn't even do "DROP TABLE ifl;" without postgres growing until it ran out of memory.

So in the end, I wiped it clean and reloaded the data, this time bounding all of the work with BEGIN/COMMIT. So far things are looking better. All of the data I've been building the tables with is elsewhere, so I can reconstruct it. Maybe adding BEGIN/COMMIT makes no difference compared with not using them before, but I'm curious to see if it does. Ideally I'd like to get to a place where I don't need to use vacuum at all.

> Actually, this is a table that sees a lot of INSERT/DELETE (it's a place to
> store work to be done and bits get removed when completed) and I haven't
> been using BEGIN/COMMIT.  This is how postgres currently handles it:
>
> LOG:  database system was not properly shut down; automatic recovery in
> progress
> LOG:  record with zero length at 0/891157C8
> LOG:  redo is not required
> LOG:  database system is ready
> LOG:  transaction ID wrap limit is 2147484146, limited by database
> "postgres"
> LOG:  unexpected EOF on client connection
> LOG:  server process (PID 7212) was terminated by signal 9
> LOG:  terminating any other active server processes
> WARNING:  terminating connection because of crash of another server process
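
As a side check on the wraparound theory, the wrap limit printed in the log above can be compared with 2^31. The sketch below assumes that the server places the limit roughly half of the 32-bit transaction ID space ahead of the oldest frozen ID (the exact offset in the server source may differ by one or so); the constant and function names are invented for the example.

```python
# Half of the 32-bit transaction ID space (assumed offset, see lead-in).
HALF_XID_SPACE = 2 ** 31

# Value copied from the "transaction ID wrap limit" log line.
wrap_limit_from_log = 2147484146


def ids_beyond_half_space(wrap_limit: int) -> int:
    """How far the reported wrap limit sits past 2^31."""
    return wrap_limit - HALF_XID_SPACE


print(ids_beyond_half_space(wrap_limit_from_log))  # 498
```

A limit only a few hundred IDs past 2^31 suggests a recently initialised cluster whose oldest frozen ID is still very small, which would fit the rm/initdb/reload sequence described earlier and does not look like a database near wraparound.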

Looks like some query is running the server out of memory.  Normally,
postgresql will spill to disk if it needs more memory, unless it's
misconfigured.

Yes. I tried increasing the swap space, but that just meant the process grew larger. From limit:
datasize     3145728 kbytes

This is from NetBSD 4.99. I ended up running with 3.5GB of swap and 1.5GB of RAM.


> I'm modifying the work to use BEGIN/COMMIT, but the ifl table worries me...
> I can't seem to do anything with it that doesn't cause postgres to crap
> out ;(

begin/commit ain't the problem here.  Looks like you've either got
pgsql set to use too much memory or it's choosing a bad plan where it
thinks something will fit in memory but it won't.

I have no other problems with any of the other tables, and it is only a small table (at the time it should have had fewer than 5000 rows).

Have you been analyzing your data before you start working on it?

No.

Can we see your postgresql.conf file?

Sure, I've attached it.
I've also run with the "default" .conf file without tuning it (down.)

Darren

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have 
# to SIGHUP the postmaster for the changes to take effect, or use 
# "pg_ctl reload". Some settings, such as listen_addresses, require
# a postmaster shutdown and restart to take effect.


#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'		# use data in another directory
#hba_file = 'ConfigDir/pg_hba.conf'	# host-based authentication file
#ident_file = 'ConfigDir/pg_ident.conf'	# IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
#external_pid_file = '(none)'		# write an extra pid file


#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

#listen_addresses = 'localhost'		# what IP address(es) to listen on; 
					# comma-separated list of addresses;
					# defaults to 'localhost', '*' = all
#port = 5432
max_connections = 15
# note: increasing max_connections costs ~400 bytes of shared memory per 
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777		# octal
#bonjour_name = ''			# defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60		# 1-600, in seconds
#ssl = off
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''
#krb_srvname = 'postgres'
#krb_server_hostname = ''		# empty string matches any keytab entry
#krb_caseins_users = off

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0		# TCP_KEEPIDLE, in seconds;
					# 0 selects the system default
#tcp_keepalives_interval = 0		# TCP_KEEPINTVL, in seconds;
					# 0 selects the system default
#tcp_keepalives_count = 0		# TCP_KEEPCNT;
					# 0 selects the system default


#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 2000			# min 16 or max_connections*2, 8KB each
temp_buffers = 200			# min 100, 8KB each
max_prepared_transactions = 5		# can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 4096			# min 64, size in KB
maintenance_work_mem = 8192		# min 1024, size in KB
max_stack_depth = 400			# min 100, size in KB

# - Free Space Map -

max_fsm_pages = 20000		# min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 200		# min 100, ~70 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 25		# min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0			# 0-1000 milliseconds
#vacuum_cost_page_hit = 1		# 0-10000 credits
#vacuum_cost_page_miss = 10		# 0-10000 credits
#vacuum_cost_page_dirty = 20		# 0-10000 credits
#vacuum_cost_limit = 200		# 0-10000 credits

# - Background writer -

#bgwriter_delay = 200			# 10-10000 milliseconds between rounds
#bgwriter_lru_percent = 1.0		# 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5		# 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333		# 0-100% of all buffers scanned/round
#bgwriter_all_maxpages = 5		# 0-1000 buffers max written/round


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = on				# turns forced synchronization on or off
#wal_sync_method = fsync		# the default is the first option 
					# supported by the operating system:
					#   open_datasync
					#   fdatasync
					#   fsync
					#   fsync_writethrough
					#   open_sync
#full_page_writes = on			# recover from partial page writes
#wal_buffers = 8			# min 4, 8KB each
#commit_delay = 0			# range 0-100000, in microseconds
#commit_siblings = 5			# range 1-1000

# - Checkpoints -

#checkpoint_segments = 3		# in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300		# range 30-3600, in seconds
#checkpoint_warning = 30		# in seconds, 0 is off

# - Archiving -

#archive_command = ''			# command to use to archive a logfile 
					# segment


#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

effective_cache_size = 1000		# typically 8KB each
#random_page_cost = 4			# units are one sequential page fetch 
					# cost
#cpu_tuple_cost = 0.01			# (same)
#cpu_index_tuple_cost = 0.001		# (same)
#cpu_operator_cost = 0.0025		# (same)

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5			# range 1-10
#geqo_pool_size = 0			# selects default based on effort
#geqo_generations = 0			# selects default based on effort
#geqo_selection_bias = 2.0		# range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10		# range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8		# 1 disables collapsing of explicit 
					# JOINs


#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'		# Valid values are combinations of 
					# stderr, syslog and eventlog, 
					# depending on platform.

# This is used when logging to stderr:
#redirect_stderr = off			# Enable capturing of stderr into log 
					# files

# These are only used if redirect_stderr is on:
#log_directory = 'pg_log'		# Directory where log files are written
					# Can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # Log file name pattern.
					# Can include strftime() escapes
#log_truncate_on_rotation = off # If on, any existing log file of the same 
					# name as the new log file will be
					# truncated rather than appended to. But
					# such truncation only occurs on
					# time-driven rotation, not on restarts
					# or size-driven rotation. Default is
					# off, meaning append to existing files
					# in all cases.
#log_rotation_age = 1440		# Automatic rotation of logfiles will 
					# happen after so many minutes.  0 to 
					# disable.
#log_rotation_size = 10240		# Automatic rotation of logfiles will 
					# happen after so many kilobytes of log
					# output.  0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


# - When to Log -

#client_min_messages = notice		# Values, in order of decreasing detail:
					#   debug5
					#   debug4
					#   debug3
					#   debug2
					#   debug1
					#   log
					#   notice
					#   warning
					#   error

#log_min_messages = notice		# Values, in order of decreasing detail:
					#   debug5
					#   debug4
					#   debug3
					#   debug2
					#   debug1
					#   info
					#   notice
					#   warning
					#   error
					#   log
					#   fatal
					#   panic

#log_error_verbosity = default		# terse, default, or verbose messages

#log_min_error_statement = panic	# Values in order of increasing severity:
				 	#   debug5
					#   debug4
					#   debug3
					#   debug2
					#   debug1
				 	#   info
					#   notice
					#   warning
					#   error
					#   panic(off)
				 
#log_min_duration_statement = -1	# -1 is disabled, 0 logs all statements
					# and their durations, in milliseconds.

#silent_mode = off			# DO NOT USE without syslog or 
					# redirect_stderr

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_line_prefix = ''			# Special values:
					#   %u = user name
					#   %d = database name
					#   %r = remote host and port
					#   %h = remote host
					#   %p = PID
					#   %t = timestamp (no milliseconds)
					#   %m = timestamp with milliseconds
					#   %i = command tag
					#   %c = session id
					#   %l = session line number
					#   %s = session start timestamp
					#   %x = transaction id
					#   %q = stop here in non-session 
					#        processes
					#   %% = '%'
					# e.g. '<%u%%%d> '
#log_statement = 'none'			# none, mod, ddl, all
#log_hostname = off


#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

# - Query/Index Statistics Collector -

#stats_start_collector = on
#stats_command_string = off
#stats_block_level = off
#stats_row_level = off
#stats_reset_on_server_start = off


#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------

#autovacuum = off			# enable autovacuum subprocess?
#autovacuum_naptime = 60		# time between autovacuum runs, in secs
#autovacuum_vacuum_threshold = 1000	# min # of tuple updates before
					# vacuum
#autovacuum_analyze_threshold = 500	# min # of tuple updates before 
					# analyze
#autovacuum_vacuum_scale_factor = 0.4	# fraction of rel size before 
					# vacuum
#autovacuum_analyze_scale_factor = 0.2	# fraction of rel size before 
					# analyze
#autovacuum_vacuum_cost_delay = -1	# default vacuum cost delay for 
					# autovac, -1 means use 
					# vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for 
					# autovac, -1 means use
					# vacuum_cost_limit


#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'		# schema names
#default_tablespace = ''		# a tablespace name, '' uses
					# the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0			# 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown			# actually, defaults to TZ 
					# environment setting
#australian_timezones = off
#extra_float_digits = 0			# min -15, max 2
#client_encoding = sql_ascii		# actually, defaults to database
					# encoding

# These settings are initialized by initdb -- they might be changed
#lc_messages = 'C'			# locale for system error message 
					# strings
#lc_monetary = 'C'			# locale for monetary formatting
#lc_numeric = 'C'			# locale for number formatting
#lc_time = 'C'				# locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'


#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000		# in milliseconds
#max_locks_per_transaction = 64		# min 10
# note: each lock table slot uses ~220 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.


#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = off
#regex_flavor = advanced		# advanced, extended, or basic
#sql_inheritance = on
#default_with_oids = off
#escape_string_warning = off

# - Other Platforms & Clients -

#transform_null_equals = off


#---------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------

#custom_variable_classes = ''		# list of custom variable class names
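
For reference, the memory settings in the file above can be converted to plain sizes using the units given in the file's own comments (8KB pages for the buffer settings, KB for the rest). This is a rough sketch only: the dictionary layout and function name are invented for the example, and it does not model how many times per query or per session each allowance can be used.

```python
PAGE_KB = 8  # "8KB each", per the comments in the file

# (value, unit size in KB), copied from the uncommented lines of the file.
settings = {
    "shared_buffers": (2000, PAGE_KB),
    "temp_buffers": (200, PAGE_KB),
    "work_mem": (4096, 1),
    "maintenance_work_mem": (8192, 1),
    "max_stack_depth": (400, 1),
}


def sizes_in_kb(conf):
    """Multiply each setting by its unit to get a size in KB."""
    return {name: value * unit for name, (value, unit) in conf.items()}


sizes_kb = sizes_in_kb(settings)
for name, kb in sizes_kb.items():
    print(f"{name:22s} {kb:6d} KB  ({kb / 1024:.1f} MB)")

# Per-process data size limit quoted earlier in the message, in KB.
DATASIZE_LIMIT_KB = 3145728
print(f"datasize limit: {DATASIZE_LIMIT_KB // 1024} MB")
```

Each configured value is in the tens of megabytes at most, far below the 2GB the backend grew to before being killed.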