Hi there,
i am relying on PgSQL's template system for most activities by using the
command "CREATE DATABASE newdb WITH TEMPLATE oldb STRATEGY FILE_COPY" to
clone big databases.
I have noticed the FILE_COPY strategy works best in my scenario, as
pointed out in the docs, but it still gives me quite long wait times.
By peeking into the source code, i've also noticed that the copy
operation takes place in src/backend/storage/file/copydir.c; there the
copy_file() function allocates a buffer to copy file data.
Most modern CoW filesystems support "reflinks" (fclonefileat syscall
usually) allowing a file to be cloned almost instantly without the need
to physically duplicate data. This behavior would be beneficial both for
performance (clone speed) and space efficiency.
I see the pg_upgrade tool already implements this strategy via the
"--clone" switch. Do you think it would be hard or beneficial to make it
available also in the main daemon?
In Posix systems, the standard file copy "cp" command switched its
default behavior in 2021 (coreutils 9+) to always try to copy files by
using reflinks first, so it should be a safe approach, it then
automatically falls back to the old behavior if reflinking can't take place.
What are your toughs? Could this fit in a feature request? If yes, how
can i structure it correctly? I've seen bug reporting guidelines but not
FR guidelines.
Thanks,
Gabriele Tozzi
--
GPG Key Fingerprint:
DAD1 E3E3 C3E9 36FB C570 F405 9B5F 7108 A1D0 2FFF