Re: How to make a block-level incremental backup using LVM?

The biggest difficulty in answering your question is that you asked about a specific method to solve a general problem, without specifying any other requirements. In particular, in order to give you good answers we need to know whether you need only single near-line backups, or whether you need multiple snapshots, or whether you need those and also off-site backups on tape or on replicated systems.

The short answer is that replication can get you redundancy (which is not a backup), rsnapshot can often do a good job of providing very space-efficient online snapshots, and Bacula Enterprise is an excellent option for backing up data to removable media like tape. You can probably get an ideal backup solution using some combination of those systems. Finding that combination is the complicated part.

Each of those things solves some portion of the problem that you're asking about. First, you asked about block-level change tracking. That's exactly what replication requires, and a replicated filesystem provides exactly that: it tracks block changes and can efficiently transfer those blocks to a remote system. Alan mentioned Ceph, and that's probably a great solution. Your production systems, local and remote, should have their data on a replicated volume where at least one of the replicas is your backup server. One backup server can hold a replica of every volume that carries data on your production systems. Once that's in place, until the backup storage array fails, you'll never have to transfer a full backup again. Whether you back up to rsnapshot or removable media, you'll back up from the local filesystem, and you've eliminated the network as a bottleneck for backups.
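To illustrate why block-level tracking saves so much transfer, here is a minimal Python sketch. It is not how Ceph or any other replication layer works internally (those record writes as they happen rather than rehashing the volume afterwards); it only shows the idea of sending the blocks that differ instead of whole files. The names `block_digests` and `changed_blocks` and the 4096-byte block size are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch; real systems pick their own


def block_digests(path, block_size=BLOCK_SIZE):
    """Return a list with one SHA-256 digest per fixed-size block of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests


def changed_blocks(old_digests, new_digests):
    """Return indexes of blocks that differ or exist in only one of the lists."""
    longest = max(len(old_digests), len(new_digests))
    changed = []
    for i in range(longest):
        old = old_digests[i] if i < len(old_digests) else None
        new = new_digests[i] if i < len(new_digests) else None
        if old != new:
            changed.append(i)
    return changed
```

Given the digests of yesterday's image and today's, `changed_blocks` lists the only blocks that would have to cross the network. Note that this sketch still reads the whole volume to compute the digests, which is the cost a real change-tracking layer avoids.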

If you only need online backups, you may be able to get them with rsnapshot. rsnapshot is fairly good when you're not dealing with very large files (such as databases). If you combine that with Ceph, your backup system will need one volume to replicate your production data and a second volume to back it up. At this point, you'll have eliminated the network bottleneck at a cost of more disk storage (which is fairly cheap, compared to the cost of increasing the speed of the network).
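rsnapshot gets its space efficiency from hard links: a file that has not changed since the previous snapshot is linked rather than copied. The Python sketch below imitates that idea only; it is not rsnapshot's code (rsnapshot drives rsync), and the `snapshot` function and its arguments are made up for illustration. It compares file contents, where rsync normally compares size and modification time.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, target):
    """Copy source into target, hard-linking files unchanged since previous.

    previous may be None for the first snapshot. previous and target must be
    on the same filesystem, and target must not live inside source.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the existing data
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh full copy
```

This also shows why very large files are a poor fit: a single changed byte in a database file forces a complete new copy of that file in every snapshot where it changed.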

If you need offline backups such as tape, Bacula can also back up from that locally replicated volume. Bacula Enterprise can provide the other bits you mentioned wanting: a web dashboard and easier configuration/management.

A few more notes follow:

On 12/14/2012 04:42 AM, Fernando Lozano wrote:
We already have a few TB on file shares (Samba) and mailboxes (Zimbra)
and just moving those bits around for our weekly full backup is proving
to be too slow for our Gb network and impossible for the hosted machines
we use as contingency and off-site backup. Besides, incremental backups
are taking too long just scanning the file systems searching for
changed files.

If scanning your filesystem takes too long, your storage array is probably too slow. Consider using RAID10 instead of RAID5/6. Consider using SSDs instead of hard drives. Consider using a fast additional drive or array as your ext4 journal.

Sorry for the long story, the question: could I implement block-level
backups using dump, dd, and some LVM or ext utility? Maybe using
inotify? Why does no open source backup tool seem to be doing this?

Mostly because inotify only lets you track which files changed, not which blocks within them, and only for files that change while the tracking daemon is running. OS X does something very much like this for Time Machine: a small daemon logs changed files from kernel notifications. The kernel keeps a small notification queue (which Linux does not, as far as I know), so changes made while the daemon is stopped, or during the boot sequence before the daemon starts, still reach the daemon's log, which Time Machine then backs up. If one of those components detects that the tracking daemon may have missed kernel notices, the system falls back to a full scan.

It's not a very complicated system, and could be duplicated fairly simply under Linux, but you'd fall back to full scans much more often since (again, as far as I know) there's no kernel notification queue, so a full scan would be required every time the tracking daemon starts. It doesn't have to wait on the start of a backup, however. The tracking daemon could do the crawl as soon as it starts.
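The start-up crawl such a daemon would need can be sketched in a few lines of Python. This is only an illustration of the fallback scan, with hypothetical function names, and it uses modification time and size as the change signature; a real daemon would combine it with inotify watches for changes made after the crawl and would persist the index between runs.

```python
import os


def crawl(root):
    """Map each non-directory entry under root to its (mtime_ns, size) signature."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # the file vanished while we were crawling
            index[path] = (st.st_mtime_ns, st.st_size)
    return index


def changes_since(old_index, new_index):
    """Return (changed_or_new, deleted) path lists between two crawls."""
    changed = sorted(p for p, sig in new_index.items() if old_index.get(p) != sig)
    deleted = sorted(p for p in old_index if p not in new_index)
    return changed, deleted
```

The daemon would load the index saved at its last shutdown, call `crawl` as soon as it starts, and hand the result of `changes_since` to the backup tool, so the backup itself never has to scan.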

Would any option allow me to restore an individual file?

Virtually every option does. The only case in which you can't restore an individual file is when you replicate a volume to a system that doesn't understand the filesystem/volume contents.
