--- gordan@xxxxxxxxxx wrote:
> On Tue, 20 May 2008, Martin Fick wrote:
>
> IMO, separation of issues should only happen when
> the entire design is worked out. Otherwise you end up
> with two half-solutions which either don't integrate
> properly or don't make a whole when they're put
> together.

Perhaps that is true, as witnessed by the current
separate unify/AFR translators. Nevertheless, I think
that my point holds: this thread started out by talking
about improving performance by migration and has turned
into an attempt to solve many completely different,
although valuable, objectives.

While I am not suggesting that we avoid thinking ahead
about how multiple problems could be solved with certain
solutions, I do think that it is important to try to
simplify your objectives when possible. In Luke's case
I think that it can be simplified.

Luke came to the table talking about a simple design
modification that would allow a file to migrate from
node to node. As a big believer in incremental changes,
I see no inherent flaw with a simple improvement to the
unify translator that would simply migrate files to
where they are heavily accessed without duplicating
copies of the file. I am not suggesting that this is
the last possible improvement in performance that can
be achieved, but rather that it would potentially be a
good, effective, simple start.

If he succeeds at this, I think that it would provide a
valuable starting point for further analysis. This
simple task alone seems like it could benefit from much
analysis to figure out what would be good migration
strategies. Once it is possible to migrate a file, it
becomes important to decide when to migrate that file.
I would think that it might even be possible to make
this controlled by the various current schedulers, using
the same strategies that each one currently specializes
in.

...

> Are you saying that files should be migrated without
> duplication?
> I guess that _could_ be done with just a
> unify translator mod, but it'd require some
> heuristics to check what nodes request which
> files most frequently, and then migrate based on
> that.

Exactly what I am suggesting (and what I thought Luke
was originally suggesting).

> This may be insufficient, however, if there are
> multiple nodes that could greatly benefit from
> having the file locally, at which point you have no
> choice but to consider a more complex
> translator that has to include AFR functionality on
> a per-file basis.

Well, let's think about this specific problem a little
more closely. If a file is accessed heavily from both
nodes, they would indeed benefit from having it locally.
The current caching translator should greatly help in
this case if the concurrent accesses are read accesses.
If the concurrent accesses are write accesses, neither
distributing the file nor the caching translator will
actually help! So what have you gained by distributing
it? Probably nothing except added complexity.

Again, Luke suggested improving the current unify
translator with migration. Do you think that this would
improve it? The answer, I suspect, has to do with your
workload, thus the suggestion for multiple migration
schedulers. Surely some workloads will benefit from
this. If you have workloads (even suggested imaginary
ones) that will not be improved by this strategy, by
all means, bring them up. If you have workloads that
will actually be hurt by the strategy, of course, bring
those up first! I suspect that some workloads may
indeed be hurt by this and may result in "migration
thrashing", where a file keeps bouncing from node to
node -> back to designing good schedulers with good
heuristics.

> > The simplest migration solution is to not tolerate
> > downed nodes! If you do not tolerate them, you do
> > not have locking/split brain types of issues to
> > resolve.
>
> This would be the "no redundancy" solution.
> As I said above, that could be done with just a unify
> translator mod (unify can't handle downed nodes
> without data loss, either). But that assumes only
> one node ever intensively accesses one file, which I
> don't think is likely to be a typical case.

Again, the question is: would this solution be an
improvement over the current unify translator? If we
picture two nodes intensively accessing one file, once
with the current scheduler and once with the single-copy
migration scheduler, which will work better? I think
that in this simplistic scenario the migration scheduler
will work better or the same.

This scenario breaks down into two possible cases. With
the current unify, either A) the file is not on either
of the two contending nodes, or B) the file is on one of
the contending nodes.

A)
  Node X     Node A     Node B
  File X

B)
  Node A     Node B
  File X

It would seem to me that the migration scenario would
attempt to mostly mimic case B. This means that it
would perform similarly to the regular translator in
case B, but perhaps better than the regular translator
in case A (thus the attempt to mimic case B).

Maybe in some rare workloads it would work worse than
case A, when Node A is heavily loaded but not accessing
File X very much, making it hard for Node B to get
File X? Of course, this makes File X a good candidate
to migrate to Node B then. As long as the scheduler can
identify this case, I would think that it should
outperform the non-migration scheduler.

Again, this seems like a fairly valuable line of
research, with enough complexity to it to provide lots
of good experience with the problem domain and help any
further improvements get designed. I can't tell Luke
what to do, but I would think that taking things much
further than this for now would be over-designing for a
problem that is not yet well enough understood, but
that's just me. :)

> > Simply migrate a file where it is needed and never
> > leave a copy behind where it can get out of sync.
> > If you want HA, install AFR under each subvolume.
> > If you want to solve split brain issues with AFR (I
> > hope we can,) start another thread. :) Once AFR
> > split brain issues are resolved in glusterfs,
> > merging AFR and Luke's potential merging unify
> > translator should be a much easier and well defined
> > task!
>
> I don't think the problems are as separable as you
> are implying, at least not with making compromises in
> both parts that cannot be made up by stacking the
> two.

Well, they are separated by the current designs of
glusterfs. Whether this is good or bad is an entirely
separate issue. As for whether there is benefit to be
gained by merging the designs: great question. There
might be eventually, but right now each of these
designs could benefit from many incremental (and even
some major) design improvements that would make them
better. If there is ever a good time to think about
merging them, I do not think that it is now.

To start with, many of the issues that you have brought
up are really AFR split brain type issues. Of course, I
can't stop you from designing, :) but I would hope that
energy would be spent on solving these issues with AFR
first, where the problem domain is a little smaller.

Just my highly opinionated and stubborn 2 cents, ;)

-Martin
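
P.S. To make the scheduler idea a little more concrete,
here is a rough Python sketch of one possible
single-copy migration heuristic. It is only an
illustration under stated assumptions, not glusterfs
code: the class name MigrationScheduler, the ratio and
min_accesses knobs, and every method name are
hypothetical. It counts accesses per node and moves the
file only when another node is clearly busier than the
current owner, which is one simple way to damp the
"migration thrashing" mentioned above.

```python
from collections import defaultdict


class MigrationScheduler:
    """Hypothetical single-copy migration policy (not a glusterfs API)."""

    def __init__(self, ratio=2.0, min_accesses=10):
        # How many times busier than the owner a node must be to win the file.
        self.ratio = ratio
        # Files with less traffic than this are never worth moving.
        self.min_accesses = min_accesses
        # file path -> node that currently holds the only copy
        self.owner = {}
        # file path -> node -> access count since the last migration
        self.counts = defaultdict(lambda: defaultdict(int))

    def place(self, path, node):
        """Record where the single copy of a file initially lives."""
        self.owner[path] = node

    def record_access(self, path, node):
        """Note that `node` read or wrote `path` once."""
        self.counts[path][node] += 1

    def pick_target(self, path):
        """Return the node the file should move to, or None to leave it."""
        per_node = self.counts[path]
        if not per_node:
            return None
        owner = self.owner[path]
        busiest = max(per_node, key=per_node.get)
        if busiest == owner or per_node[busiest] < self.min_accesses:
            return None
        # Hysteresis: require a clear margin over the owner's own traffic,
        # so two similarly busy nodes do not bounce the file back and forth.
        owner_hits = max(per_node.get(owner, 0), 1)
        if per_node[busiest] >= self.ratio * owner_hits:
            return busiest
        return None

    def rebalance(self, path):
        """Migrate the file if the policy says so; return the new owner or None."""
        target = self.pick_target(path)
        if target is not None:
            self.owner[path] = target
            # Start counting afresh once the file has moved.
            self.counts[path].clear()
        return target
```

In case A above (File X on Node X, accessed only from
Nodes A and B), this policy moves the file to whichever
contending node is busier, which mimics case B. Once
the file sits on Node A, Node B has to generate at
least `ratio` times Node A's traffic before the file
moves again. Whether a fixed ratio is a good heuristic
for any real workload is exactly the open question.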