Re: Backup compaction optimization in a block-level replication environment

Further progress report: with small chunks, compaction takes about 15 times longer.  Judging by the rate at which the disk file grows, it's almost as if there is O(n^2) behaviour somewhere.  (Running perf on a compaction suggests that ctl_backups spends 90% of its time compressing, decompressing, or calculating SHA1 hashes.)  So I'm going back to large-ish chunks.  Current values:

backup_compact_minsize: 1024
backup_compact_maxsize: 65536
backup_compact_work_threshold: 10

The compression ratio was hardly any different (less than 1%) with many small chunks compared with huge chunks.
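
For anyone who wants to check the chunk-size effect on their own data, here is a rough Python sketch.  It is not how ctl_backups works internally: it treats every chunk as an independent zlib stream, uses synthetic filler text, and the chunk sizes are arbitrary illustrative values in bytes.  The helper names (sample_data, chunked_size) are made up for this example.

```python
import random
import zlib


def sample_data(size, seed=0):
    """Build compressible, mail-like filler text (purely synthetic)."""
    rng = random.Random(seed)
    words = [b"backup", b"chunk", b"cyrus", b"mailbox", b"rsync",
             b"message", b"compact", b"user", b"shared", b"offsite"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(words) + b" "
    return bytes(out[:size])


def chunked_size(data, chunk_size):
    """Total compressed size when each chunk is its own zlib stream."""
    return sum(len(zlib.compress(data[i:i + chunk_size]))
               for i in range(0, len(data), chunk_size))


if __name__ == "__main__":
    data = sample_data(4 * 1024 * 1024)
    # Hypothetical chunk sizes, chosen only to show the trend.
    for chunk_size in (4 * 1024, 64 * 1024, 1024 * 1024):
        total = chunked_size(data, chunk_size)
        print(f"{chunk_size:>8} byte chunks: ratio {total / len(data):.4f}")
```

Deflate can only refer back to the previous 32 KiB of input, so once chunks are much larger than that, starting a fresh stream at each chunk boundary costs very little.  That would be consistent with seeing almost no difference in ratio between moderately small and huge chunks.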

Setting the work threshold to a number greater than 1 helps only a little.  I think the huge disparity in size between my smaller and larger user backups is hurting me here.  Whatever I set the threshold to, it will be simultaneously too large for most users and too small for the huge %SHARED user.

Confession time: having inspected the source of ctl_backups, I admit that I misunderstood what happens to chunks when compaction is triggered.  I thought that each chunk was examined and either compacted or left alone (with its bytes copied from old to new unchanged).  In fact compaction applies to the entire file: every chunk in turn is inflated to /tmp and then deflated again from /tmp, minus any messages that may have expired, so the likelihood of the compressed byte stream being the same is slim.  That will confound the rsync rolling checksum algorithm, and the entire backup file will likely have to be transmitted again.
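
To illustrate why recompression hurts rsync, here is a toy Python model.  It is only a sketch under several assumptions: one chunk is modelled as a single zlib stream, the messages are synthetic, the expired message is the first one in the chunk, and the block matching is a crude stand-in for rsync (it compares fixed-size blocks directly rather than using a rolling checksum plus a strong hash).  BLOCK, make_messages and reusable_fraction are names invented for this example.

```python
import random
import zlib

BLOCK = 512  # hypothetical block size; rsync picks its own


def make_messages(count, seed=0):
    """Create synthetic messages with poorly compressible hex bodies."""
    rng = random.Random(seed)
    msgs = []
    for i in range(count):
        body = "".join(rng.choice("0123456789abcdef") for _ in range(600))
        msgs.append(
            f"Message-Id: <{i}@example.invalid>\r\n\r\n{body}\r\n".encode())
    return msgs


def reusable_fraction(old, new, block=BLOCK):
    """Fraction of `new` covered by whole blocks of `old`, at any offset."""
    old_blocks = {old[i:i + block]
                  for i in range(0, len(old) - block + 1, block)}
    matched = 0
    i = 0
    while i + block <= len(new):
        if new[i:i + block] in old_blocks:
            matched += block   # block already exists on the receiving side
            i += block
        else:
            i += 1             # slide forward one byte and try again
    return matched / len(new)


if __name__ == "__main__":
    msgs = make_messages(200)
    old_raw = b"".join(msgs)
    new_raw = b"".join(msgs[1:])   # first message has "expired"
    raw = reusable_fraction(old_raw, new_raw)
    packed = reusable_fraction(zlib.compress(old_raw), zlib.compress(new_raw))
    print(f"uncompressed: {raw:.1%} of the new data is reusable")
    print(f"recompressed: {packed:.1%} of the new data is reusable")
```

Dropping one message from the uncompressed data leaves nearly every block findable at a shifted offset, which is the case rsync handles well.  After recompression the same small change alters the compressed bytes that follow it, so far fewer blocks can be reused.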

With that in mind I've decided to make compaction a weekend-only task: I'll take it out of cyrus.conf EVENTS and put a weekly cron/systemd job in its place.  During the week backups will be append-only, to keep rsync happy.  At weekends, compaction will combine the last week's small chunks, and I'll have all weekend to transmit the hundred GB of backup files offsite.

--
Deborah Pickett
System Administrator
Polyfoam Australia Pty Ltd

----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus



