--- Luke McGregor <luke@xxxxxxxxxxxxxxx> wrote:
> This could cause some serious problems especially
> on a hevially accessed file. The problem would i
> believe be worsened as the nodes which are hosting
> any hevially accessed file are the most likely to
> not respond quickly to any kind of multicast.

I think this problem is exaggerated. It is possible that a heavily
accessed node holds a file that is needed, and that this node ends up
being the slowest to respond. But is that situation any worse than if
you did not migrate/replicate files at all?

Let's look at the problem this way, current unify (scenario 1):

  Node A     Node B     Node C
  file A

If file A is on Node A and it is being heavily accessed by Node B,
then Node A will be heavily accessed too, right? This means that when
Node C requests file A it will still be contending with Node B, so
Node A may be slow to respond.

Fast forward to a migration scenario (2). Node B is heavily accessing
file A, so it gets duplicated to Node B.

  Node A     Node B     Node C
  file A     file A

When Node C comes along and requests file A, it may have to wait
longer for Node B to respond to a metadata query, but this should
probably be shorter than a whole file read from Node A in scenario 1,
wouldn't it? Overall you have still potentially drastically improved
both the system latency and throughput in scenario 2 over scenario 1.

I really don't think the fact that Node B is heavily accessed will
make your solution slower. It just becomes another potential
contention point after you have already scaled up quite a bit with
migration/caching. For this to become your blocking point, it also
requires that Node B be heavily loaded and that it not be accessing
file A!! If Node B were accessing file A, it would still be a drag on
the accessibility of file A to Node C, so adding a quorum solution may
not even help in this case.
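To make the comparison between scenario 1 and scenario 2 concrete,
here is a rough toy model in Python. Everything in it is an assumption
made for illustration, not a measurement of unify: the cost units
(FILE_READ, META_QUERY), the slow-down formula (service time divided
by the node's idle fraction), and the guess that in scenario 2 Node C
reads the file from the now-idle Node A after hearing back from Node B.

```python
from dataclasses import dataclass

# Hypothetical cost units, chosen only so that a whole-file read is
# much more expensive than a metadata query.
FILE_READ = 100.0
META_QUERY = 1.0


@dataclass
class Node:
    name: str
    busy_fraction: float  # share of the node's time taken by other work, 0 <= x < 1


def response_time(node: Node, service_time: float) -> float:
    """Assumed slow-down: a request takes longer the busier the node is."""
    return service_time / (1.0 - node.busy_fraction)


def scenario_1(load_from_b: float) -> float:
    """No migration: Node C reads file A from Node A while Node B keeps A busy."""
    node_a = Node("A", load_from_b)
    return response_time(node_a, FILE_READ)


def scenario_2(load_on_b: float) -> float:
    """After duplication: Node B works on its own copy, so Node A is idle.

    Node C waits for the busy Node B to answer a metadata query, then
    (by assumption) reads the file from the idle Node A.
    """
    node_a = Node("A", 0.0)
    node_b = Node("B", load_on_b)
    return response_time(node_b, META_QUERY) + response_time(node_a, FILE_READ)


if __name__ == "__main__":
    for load in (0.0, 0.5, 0.8, 0.95):
        print(f"load={load:.2f}  scenario 1: {scenario_1(load):8.1f}"
              f"  scenario 2: {scenario_2(load):8.1f}")
```

Under these assumptions scenario 2 only loses when the load is close
to zero, where the extra metadata hop is pure overhead. As soon as
Node B is doing real work, the slow metadata answer costs far less
than a contended whole-file read.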

However, if Node B becomes heavily loaded and is no longer accessing
file A, then again your migration solution will kick in and file A
should migrate to where it is actually being accessed, potentially
Node C!

Perhaps the lesson here is that having the file on fewer nodes, and
only on the nodes that are actually accessing it, is potentially
better for latency!! If latency becomes an issue, then perhaps a heavy
bias towards migration should be considered? And perhaps an even
heavier bias towards flushing a file from a loaded node if that file
is not being accessed on the loaded node!

All in all, I think that enhancing your migration/flushing heuristics
may be a better way to deal with this latency than any centralized
metadata solution.

Cheers,

-Martin