> Why? Rsnapshot and AMANDA are already available, stable, and quite robust. They seem to cover just the niches you're aiming at, and the performance and security ramifications have already been worked out. Plus, neither requires an SQL database, which makes them much more robust.
>
> It's nothing personal or criticizing your code. I just wind up cleaning up after a lot of projects that reinvent the wheel. Backup systems are a popular such target.

Good questions. Amanda and Bacula are both good, full-featured backup tools, but they take a while to set up and get working. I was previously using something similar to Rsnapshot, but wanted compression and better file-level deduplication. Rsnapshot will de-duplicate files that are the same between backup sets, but not files that are the same across multiple hosts, or different copies of a file on the same host.

The dedup method it uses (hard links) can also leave a very large set of files to analyse on the backup media. In my case, backing up 20-odd hosts and keeping daily/weekly/monthly backups ended up with 600 total backups on the storage unit and about 300 million files. That made keeping an rsync'd copy of the backup drive a bit tedious, not to mention running a "find" command to get stats about the files (such as the largest ones, or which were unique) to improve the include/exclude list.

This is where the SQL database comes into play -- I can easily get the top space-consuming unique files/directories, etc. BTW, only the file listings / metadata are kept in the DB -- the contents are stored as lzop-compressed files in the filesystem. And since the SHA1 hash is used as the file name, you get full file-level deduplication across multiple hosts for free.
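To illustrate why naming stored files by their SHA1 hash gives cross-host deduplication, here is a minimal Python sketch. It is not Snebu's code: the function names `store_file` and `restore_file`, the flat vault directory, and the use of zlib (standing in for lzop, which is not in the Python standard library) are all assumptions made for the example.

```python
import hashlib
import zlib
from pathlib import Path


def store_file(vault: Path, source: Path) -> str:
    """Store `source` in `vault` under its SHA1 digest and return the digest.

    Hypothetical helper: the flat layout and zlib compression are
    illustrative assumptions, not the real on-disk format.
    """
    data = source.read_bytes()
    digest = hashlib.sha1(data).hexdigest()
    target = vault / digest
    # Identical contents hash to the same name, so a file already seen
    # (from any host or any path) is simply not written a second time.
    if not target.exists():
        target.write_bytes(zlib.compress(data))
    return digest


def restore_file(vault: Path, digest: str) -> bytes:
    """Return the original contents stored under `digest`."""
    return zlib.decompress((vault / digest).read_bytes())
```

Storing the same file from two different hosts returns the same digest and leaves a single object in the vault; the per-host path and ownership details would live in the metadata catalog rather than in the vault itself.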
Other systems I looked at were BackupPC, which seemed a bit more complicated (both in setup and internal workings -- I couldn't find how to execute pre/post backup scripts for live DB backups, etc.), Obnam (stores backups in an internal binary format, has issues with a daily/weekly/monthly retention schedule), and a few others that were still based on the rsync/hardlinks method.

BTW, here's a list of my main design goals, compared to other tools.

* Keep the backend data format as transparent as possible (files are stored in lzop-compatible format, and the data catalog is in an SQLite database). I didn't want the data to be in a big blob like some of the non-rsync based Linux backup tools.
* Use the database for identifying duplication, instead of hard links in the filesystem (when using an rsync backup with hard links, I couldn't in turn rsync the entire backup drive to another volume in any reasonable timeframe).
* Use tools that are already on the clients so there is no agent to install (find, tar).
* Support advanced features, such as preserving SELinux attributes and ACLs, and handling sparse files properly.
* Have the metadata in a database that can be queried with SQL syntax for reporting and diagnostic purposes. I still owe documentation on the DB schema layout though.
* Easy to get up and running (install, configure a couple of lines to point to the backup media, and run), while retaining flexibility in setting up a client/server installation.

And thank you for your feedback. If there is anything that the above doesn't cover, let me know (or if you think there needs to be more documentation or additional features [without adding too much complexity]).

If the SQLite DB is a concern, there is also an import/export function to dump the metadata to a tab-delimited text file.

Summary: Snebu fills a gap between simple tools and more full-featured, flexible systems, without being overly complex or opaque.
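As a rough illustration of the reporting goal above, the following Python sketch runs a "largest unique contents" query against a small in-memory SQLite catalog. The table `file_entries`, its columns, the sample rows, and both function names are made up for the example; the real schema is not documented here and will differ.

```python
import sqlite3


def make_demo_catalog() -> sqlite3.Connection:
    """Build a tiny in-memory catalog with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_entries (host TEXT, path TEXT, sha1 TEXT, size INTEGER)"
    )
    conn.executemany(
        "INSERT INTO file_entries VALUES (?, ?, ?, ?)",
        [
            ("hostA", "/var/log/big.log", "h1", 5000),
            ("hostB", "/var/log/big.log", "h1", 5000),  # same contents as hostA
            ("hostA", "/etc/motd", "h2", 40),
            ("hostB", "/home/u/video", "h3", 9000),
        ],
    )
    return conn


def largest_unique_contents(conn: sqlite3.Connection, limit: int = 3):
    """Return (sha1, bytes, copies) rows, largest stored contents first.

    Grouping by hash counts each distinct content once, however many
    hosts or paths reference it.
    """
    return conn.execute(
        """
        SELECT sha1, MAX(size) AS bytes, COUNT(*) AS copies
        FROM file_entries
        GROUP BY sha1
        ORDER BY bytes DESC, sha1
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
```

With a hard-link based layout, getting the same answer means walking every file on the backup media; with the catalog it is a single grouped query.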
--
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct