Re: Why are some WAL files in pg_xlog symlinks to old files?

On Wed, Sep 29, 2010 at 10:15 AM, Nigel <nigelspleen@xxxxxxxxx> wrote:
> Hello,
>
> We're running PG 8.3 in a warm standby configuration.  About 3 weeks ago we
> had to fail over from the primary to the standby.  That worked fine, but
> we're having problems getting standby mode set up again.  On the new
> standby, everything works fine for a little while: WALs are rsynced over
> and processed correctly as far as I can tell.  But every 65-75 minutes (very
> regularly), a WAL file is copied that's actually a symlink.  When the
> standby tries to read the rsynced symlink, it hangs indefinitely, presumably
> because the target of the link doesn't exist on the standby.
>
> In the primary's pg_xlog, I see the expected WAL files with increasing
> numbers and recent modification dates, but every 65-75 files there's one of
> these symlinks. For example:
>
> Sep 28 16:13 0000000300000A5C00000070
> Sep 28 16:15 0000000300000A5C00000071
> Sep 28 16:12 0000000300000A5C00000072
> Sep  5 01:00 0000000300000A5C00000073 -> /srv/db/chdbprod_wal_archives/00000001000009D6000000D6
> Sep 28 16:21 0000000300000A5C00000074
> Sep 28 16:19 0000000300000A5C00000075
>
> The "/srv/db/chdbprod_wal_archives" directory is where incoming WAL files
> used to go, back when the current primary server was the standby.  The
> September 5 date you see above is shortly before the failover was done.  It
> confused me at first until I remembered that it's the mod date of the target
> of the symlink, not the link itself (which in this case was presumably
> created around 16:20).  The target of the symlinks is always the same.
>
> pg_xlog also contains a 00000003.history file, which references the target
> of the symlinks.  Here's its contents:
>
> 1       00000001000009D6000000D6        before transaction 0 at 2000-01-01 00:00:00+00
>
> I gather that my problems here are due to having a primary server that was
> itself formerly a standby, but I'm not sure what action to take.  I don't
> know enough about how the history files work and what the significance of
> the symlinks is.  What purpose do the symlinks serve?  Why are they
> recreated regularly at slightly more than hourly intervals?  Why do they
> point to a directory that was only used back when the primary was a
> standby?  (If it makes any difference, back when the primary server was a
> standby, it was running pg_standby with the -l option.)  Does their presence
> mean that something's wrong on the primary, or should they be ignored when
> copying to the standby?

I suspect the cause is the -l option. "pg_standby -l" creates a symlink in
pg_xlog that points to the archived WAL file. At failover, that symlink is
unfortunately renamed to a new WAL file name as part of WAL recycling, so a
symlink to the old archived WAL file remains in pg_xlog.
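
To see which segments in pg_xlog are affected, a small script along these
lines may help. This is only a sketch: the function name find_symlinked_wal
is made up for illustration, and it assumes the usual 24-hex-digit WAL
segment naming, where the first 8 digits are the timeline ID. It only reports
the symlinks; it does not remove or change anything.

```python
import os
import re

# WAL segment names are 24 uppercase hex digits:
# 8 for the timeline ID, 8 for the log ID, 8 for the segment number.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def find_symlinked_wal(xlog_dir):
    """Return (name, timeline, target) for every WAL-named symlink in xlog_dir.

    Hypothetical helper for illustration; it reads the directory only.
    """
    found = []
    for name in sorted(os.listdir(xlog_dir)):
        path = os.path.join(xlog_dir, name)
        # islink() is true even when the link target does not exist.
        if WAL_NAME.match(name) and os.path.islink(path):
            timeline = int(name[:8], 16)
            found.append((name, timeline, os.readlink(path)))
    return found
```

Calling find_symlinked_wal() on the primary's pg_xlog directory would list
entries such as 0000000300000A5C00000073 together with the archive path each
one points to.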

AFAIR, the -l option was removed from pg_standby because of this problem.
http://archives.postgresql.org/pgsql-committers/2009-06/msg00323.php

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
