On Wed, Jan 21, 2015 at 10:02 PM, David G Johnston <david.g.johnston@xxxxxxxxx> wrote:
Craig James-2 wrote
> We've encountered a serious bug with pg_basebackup. It seems to be
> following hard links and duplicating all files in the tablespaces rather
> than preserving links.
This entire sentence doesn't make sense to me. How does one "follow" a
hard-link? A soft-link yes but a hard-link is an alias to actual data. I'm
not sure directory hard-linking is even allowed or used so following in that
sense doesn't compute...
See the man page for rsync, the -H option, which explains it better:
    -H, --hard-links
        This tells rsync to look for hard-linked files in the transfer
        and link together the corresponding files on the receiving side.
        Without this option, hard-linked files in the transfer are
        treated as though they were separate files.
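
To make the man page excerpt concrete, here is a minimal Python sketch of what "hard-linked files" means at the filesystem level: two directory entries that share one device and inode number. This only illustrates the concept; it is not what rsync or pg_basebackup actually run, and the names same_file and demo, plus the file names, are made up for the example.

```python
import os
import tempfile


def same_file(path_a, path_b):
    """Return True if both paths are hard links to the same inode."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def demo():
    """Create a file and a hard link to it in a scratch directory.

    Returns (paths share an inode, link count of the original file).
    """
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "relfile")        # hypothetical name
        linked = os.path.join(tmp, "relfile.link")     # hypothetical name
        with open(original, "wb") as f:
            f.write(b"x" * 1024)
        os.link(original, linked)                      # second name, same data
        return same_file(original, linked), os.stat(original).st_nlink


if __name__ == "__main__":
    print(demo())
```

A copy tool that reads each name independently writes the data once per name, which is the duplication described above; preserving links requires tracking inodes the way same_file does.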
> My guess is that pg_basebackup is using (or doing the equivalent of)
> rsync(1) without the --hard-links option, and that these hard links were
> created by pg_upgrade when we went from 8.4.17 to 9.3.5.
And how, exactly, did you perform the pg_upgrade? As mentioned down-thread
pg_upgrade does use hard links; specifically to avoid duplication of data
(in exchange you lose the ability to easily fall back to the old database
version). I'm doubtful that it, by itself, is contributing to this problem
but again my experience in this area is limited. But what you have shown us
to this point is far from conclusive.
I'm pretty sure I understand how this happened, but it's speculation.
This database lives in /data/postgres-9.3, but PGDATA points to /postgres, which is a symbolic link to /data/postgres, which is in turn a symbolic link to postgres-9.3. The tablespaces are all in /data/postgres-9.3/tablespaces, but the entries in the pg_tblspc directory are symbolic links to /postgres/tablespaces (which do in fact resolve correctly), for example:
# ls -l /data/postgres-9.3/main/pg_tblspc/16747
lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/
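
The symbolic link chain described above can be reproduced in a scratch directory to confirm that both spellings of the tablespace path land in the same place. This is a rough sketch under assumptions: the directory names mirror the ones in this message but are created under a temporary root, and build_layout and resolves_to_same_dir are hypothetical helper names.

```python
import os
import tempfile


def build_layout(root):
    """Recreate the link chain under root:

    root/postgres       -> root/data/postgres   (absolute symlink)
    root/data/postgres  -> postgres-9.3         (relative symlink)
    root/data/postgres-9.3/tablespaces          (real directory)
    """
    data = os.path.join(root, "data")
    real = os.path.join(data, "postgres-9.3")
    os.makedirs(os.path.join(real, "tablespaces"))
    os.symlink("postgres-9.3", os.path.join(data, "postgres"))
    os.symlink(os.path.join(data, "postgres"), os.path.join(root, "postgres"))
    via_links = os.path.join(root, "postgres", "tablespaces")
    direct = os.path.join(real, "tablespaces")
    return via_links, direct


def resolves_to_same_dir(path_a, path_b):
    """True if both paths resolve to the same location once symlinks are followed."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


def demo():
    with tempfile.TemporaryDirectory() as root:
        via_links, direct = build_layout(root)
        return via_links != direct, resolves_to_same_dir(via_links, direct)


if __name__ == "__main__":
    print(demo())
```

The two path strings differ, yet they resolve to one directory, which is why anything that compares tablespace locations by name alone would treat them as separate places.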
Normally when pg_upgrade runs, you end up with two parallel directory hierarchies, and $PGDATA points to the new one when you're done. But because of the way our symbolic links work, both the new and the old directories are in the /data/postgres-9.3/tablespaces directory. You can't simply delete the old $PGDATA directory, because that would erase the entire database.
I'll have to dig around to prove to myself that this is the case.
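
One way to do that digging is to walk the tablespaces directory and group regular files by inode, listing every inode reachable under more than one name. The following is only a sketch of that check, not a tested diagnostic for this installation; find_hard_link_groups and demo are hypothetical names, and the demo uses a scratch directory with invented file names rather than the real /data/postgres-9.3/tablespaces.

```python
import os
import stat
import tempfile
from collections import defaultdict


def find_hard_link_groups(top):
    """Return {(device, inode): [paths]} for regular files that appear
    under more than one name somewhere below top."""
    groups = defaultdict(list)
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}


def demo():
    """Build an old and a new directory sharing one hard-linked file."""
    with tempfile.TemporaryDirectory() as top:
        old_dir = os.path.join(top, "old_cluster")   # hypothetical layout
        new_dir = os.path.join(top, "new_cluster")
        os.makedirs(old_dir)
        os.makedirs(new_dir)
        shared = os.path.join(old_dir, "16384")
        with open(shared, "wb") as f:
            f.write(b"data")
        os.link(shared, os.path.join(new_dir, "16384"))
        with open(os.path.join(new_dir, "16385"), "wb") as f:
            f.write(b"only in new")
        groups = find_hard_link_groups(top)
        return [[os.path.relpath(p, top) for p in paths] for paths in groups.values()]


if __name__ == "__main__":
    for paths in demo():
        print(paths)
```

If the old and new cluster directories really do sit side by side in the same tablespace tree, each shared data file should show up as one group with a path in each directory.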
Craig