On Wed, 11 May 2016 11:30:00 -0500 Dennis Gilmore <dennis@xxxxxxxx> wrote:

> I am trying to catch up with email from the last week or so, I am
> still 13000 behind, So I did not catch this. this only works if you
> do not care about hardlinking, which is going to mean that people are
> using an extra 500G + of disk on the mirrors. An issue some mirrors
> have hit due to what I am assuming are bad mirroring practices. the
> only way to fix it properly is going to mean re-evaluating how we
> push content and how we message the pushing, and having tooling to
> either do push mirroring or enabling intelligent pull based
> mirroring, including information about whats hardlinked where and
> what content we have pushed. this is like a bandaid when the sore
> under it is still festering away.

Well, this change was simply to allow us to explore using more data for
syncing. Hopefully we can come up with a way to express hardlinks with
it.

If you have the fullfiletimelist file and there is a new one you can
diff them. Once you have that list of files that were deleted or
changed, you can sort them and possibly hard link the ones with the
same name/timestamp/size. All of our hardlinked files should be the
same name/timestamp/size I think.

But failing all that we could easily have people rsync just the changed
files (saving us LOTS AND LOTS of iops), but not getting hardlinks and
then once a week or two doing a full traditional sync that would delete
any removed files and hardlink everything. Doing this they would not
have an extra 500GB, they would only get back those files changed in
the last week that were hardlinked, so it would be much smaller I
suspect. This would save us tons of iops, make their syncs super fast
and only have a slight bad effect on space.

If this all turns out to not work out, no harm done, but I think it
might well help us out a great deal.

kevin
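
To make the diff-and-group idea above a bit more concrete, here is a rough sketch in Python. It is only an illustration: the real fullfiletimelist layout is not described in this message, so the sketch assumes a simplified, hypothetical format of one "mtime<TAB>size<TAB>path" entry per line, and the function names (parse_filelist, diff_filelists, hardlink_candidates) are made up for the example.

```python
import posixpath
from collections import defaultdict


def parse_filelist(text):
    """Parse a listing into {path: (mtime, size)}.

    ASSUMPTION: one 'mtime<TAB>size<TAB>path' entry per line. This is a
    hypothetical simplified layout, not the real fullfiletimelist format.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        mtime, size, path = line.split("\t", 2)
        entries[path] = (int(mtime), int(size))
    return entries


def diff_filelists(old, new):
    """Return (changed, deleted) as sorted lists of paths.

    'changed' holds paths that are new or whose mtime/size differ;
    'deleted' holds paths present in the old list but not in the new one.
    """
    changed = sorted(p for p, meta in new.items() if old.get(p) != meta)
    deleted = sorted(p for p in old if p not in new)
    return changed, deleted


def hardlink_candidates(new, changed):
    """Group changed paths that share the same name, timestamp and size.

    Each returned group is a list of paths that MIGHT be hardlinks of one
    another; this is a heuristic, not proof that the contents match.
    """
    groups = defaultdict(list)
    for path in changed:
        mtime, size = new[path]
        groups[(posixpath.basename(path), mtime, size)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

A mirror could then fetch only the paths in "changed" (for example by writing them to a file and passing it to rsync's --files-from option), remove the paths in "deleted", and consider hardlinking each candidate group locally instead of downloading every copy. Whether name/timestamp/size is a safe enough match is exactly the open question in the message, so a checksum comparison before linking would be a reasonable extra guard.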
_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
http://lists.fedoraproject.org/admin/lists/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx