On 07/10/2013 07:01 AM, Allan Latham wrote:
> I have a simple scenario and it just simply doesn't work. Reading over
> the network when the file is available locally is plainly wrong. Our
> application cannot take the performance hit nor the extra network traffic.

Another victim of our release process. :( Code was added to choose the local subvolume whenever possible in *June 2012* (commit 0baa65b6). Further fixes and related changes, including a user-submitted patch to force this choice for sites with more complex needs, have gone in since then. None of them have made it into a release yet, since 3.4 is still in beta and the changes have not been backported into 3.3.anything (including 3.3.1 which I see you were using). All I can offer is an apology.

> 1. get a simple minimalist configuration working - 2 hosts and
> replication only.
> 2. make it bomb-proof.
> 2a. it must cope with network failures, random reboots etc.
> 2b. if it stops it has to auto-recover quickly.
> 2c. if it can't it needs thorough documentation and adequate logs so a
> reasonable sysop can rescue it.

This is one of my own pet peeves. I will personally be working on the internals documentation soon, so users will at least have a chance of understanding what the often-cryptic log messages really mean. Improvements to logging, event reporting, and so on are also ongoing, albeit slowly and not under my direct purview.

> 2d. it needs a fast validation scanner which verifies that data is where
> it should be and is identical everywhere (md5sum).

How fast is fast? What would be an acceptable time for such a scan on a volume containing (let's say) ten million files?

> 3. make it efficient (read local whenever possible - use rsync
> techniques - remove scalability obstacles so it doesn't get
> exponentially slower as more files are replicated)

Can you explain "exponentially"? The time for a full scan should increase *linearly* with number of files. That's bad enough, and it's why we're starting to get away from reliance on full scans in favor of logging or journaling approaches, but if you're seeing exponential behavior then something is amiss.

> 4. when that works expand to multiple hosts and clever distribution
> techniques.

That would be a fine sentiment for a new project, but it's not really an option when there are already thousands of users relying on the "clever distribution techniques" and many other features in production. We do have to fix their bugs too, so we can't devote all of our resources to improving or reimplementing replication. Believe me, I wish we could.

Thank you for your constructive feedback. I hope that we can use it to make things better for everyone.