Did you run a detailed du both while the problem was showing and after the reboot, and diff the two
outputs to find any files/dirs involved?
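A minimal sketch of that comparison, in case it helps. The function names and the /var/tmp paths in the usage comments are only illustrative assumptions; the idea is to save per-directory usage in KiB twice and then list the directories whose size differs between the two snapshots.

```shell
#!/bin/sh
# Hypothetical helpers; names and paths are illustrative, not from the thread.

# Record per-directory disk usage in KiB, staying on one filesystem (-x).
# $1 = directory to scan, $2 = snapshot file (keep it outside the scanned tree)
snapshot_du() {
    du -kx "$1" > "$2"
}

# Print "delta_kib<TAB>path" for every directory whose usage differs between
# two snapshots, smallest delta first. Directories that exist only in the
# first snapshot are not reported.
# $1 = earlier snapshot, $2 = later snapshot
compare_du() {
    awk -F '\t' '
        NR == FNR { before[$2] = $1; next }
        { delta = $1 - before[$2]; if (delta != 0) print delta "\t" $2 }
    ' "$1" "$2" | sort -n
}

# Example usage (commented out so that sourcing this file has no side effects):
# snapshot_du /usr/local/pgsql /var/tmp/du-leaking.txt    # while space is missing
# snapshot_du /usr/local/pgsql /var/tmp/du-rebooted.txt   # after the reboot
# compare_du /var/tmp/du-leaking.txt /var/tmp/du-rebooted.txt
```

If the two snapshots come out nearly identical while df still reports the extra usage, the missing space is probably not in any file that du can see.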
That said, I think you might consider posting on freebsd-[questions|stable] as well.
On Wed 20 Mar 2013 11:49:07 Dan Thomas wrote:
Hi Guys,
We're seeing a problem with some of our FreeBSD/PostgreSQL servers "leaking" quite significant amounts of disk space:
> df -h /usr/local/pgsql/
Filesystem       Size    Used   Avail  Capacity  Mounted on
/dev/mfid1s1d    1.1T    772G    222G     78%    /usr/local/pgsql
> du -sh /usr/local/pgsql/
741G    /usr/local/pgsql/
Stopping Postgres doesn't fix it, but rebooting does, which to me points at the OS rather than PG. However, the leak is only apparent in the dedicated pgsql partition, and only on our database servers, so PostgreSQL seems to at least be involved. The partition itself is a relatively standard UFS partition:
> grep /usr/local/pgsql /etc/fstab
/dev/mfid1s1d    /usr/local/pgsql    ufs    rw    2    2
> tunefs -p /usr/local/pgsql/
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  2048
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)
LSOF isn't showing any open files:
> lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
0
We're not creating filesystem snapshots:
> find /usr/local/pgsql/ -flags snapshot
>
Not all of our servers are leaking space; it's only the more recently installed systems. Here's a quick breakdown of versions:
FreeBSD    PostgreSQL    Leaking?
8.0        8.4.4         no
8.2        9.0.4         no
8.3        9.1.4         yes
8.3        9.2.3         yes
9.1        9.2.3         yes
Each of these servers is configured with a warm standby, so we've been switching them over to the standby to reclaim the space (rebooting the primary is too much downtime). The standby does *not* demonstrate this problem while it's being used as a standby, but it starts leaking space once it's been made the primary.
Initially I thought this might be related to WAL files. However, the pg_xlog dir is symlinked outside of the /usr/local/pgsql partition that is demonstrating this problem:
> ll /usr/local/pgsql/data/pg_xlog
lrwxr-xr-x 25B Oct 19 10:48 pg_xlog -> /usr/local/pglog/pg_xlog/
I've exhausted everything I can think of to try to solve this one. Has anyone got any ideas on how to go about debugging this?
Thanks,
Dan
-
Achilleas Mantzios
IT DEV
IT DEPT
Dynacom Tankers Mgmt