On Thu, Nov 24, 2011 at 12:31 PM, Xiaofei Du <xiaofei.du008@xxxxxxxxx> wrote:
> This means no matter how many clients I have I should always get
> around 79MB/s, right? This sounds reasonable. Thanks for the
> explanation. So do you guys have plans to solve this "unbalanced
> cluster" problem?

Eventually. Right now Ceph is pretty config-heavy, unfortunately. As I said, you can change the weights of the slow nodes — this will map less of the data to them, so they see fewer writes and reads. But even then you're going to be stuck pretty low because of your disk write bandwidth. On a modern disk that matters less, since it can push two simultaneous 50MB/s+ streams (i.e., one 50MB/s journal stream and one 50MB/s data-store stream), but even so we generally recommend a separate spindle for the journal.

> I guess several other distributed file systems have
> the same issue. HDFS has this issue too. I guess the solution is to
> use stable disk IO bandwidth hardware. If that couldn't be guaranteed,
> you need to detect slow nodes and kick them out of the cluster.

I'm not sure how other systems handle it — many don't, and HDFS might or might not. But looking at the description of how HDFS replicates data, it seems to matter less there. http://hadoop.apache.org/common/docs/current/hdfs_design.html#Robustness indicates that all data is written to a client-local file first, and then (asynchronously from the client's write) it is streamed out to the first DataNode, which copies it to the second, which copies it to the third, etc. In this scheme the file doesn't need to be fully replicated to every DataNode before the client can close the file and consider it safe. Those are appropriate data-consistency choices for HDFS, but Ceph is designed to satisfy a much more stringent set of consistency requirements. :)

>> Are you using ceph-fuse or the kernel client? And if it's the kernel
>> client, what version?
> I am using ceph-fuse

Bah humbug.
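Going back to the HDFS comparison for a second — here's a toy latency model (my own simplification; the function names and numbers are invented, and it glosses over HDFS's real pipelining) of why the slowest node hurts a synchronous-replication system more than a write-locally-and-drain-asynchronously one:

```python
# Toy model (my simplification, not either system's actual protocol).
# HDFS-style: the client's write completes against a local file, and the
# pipeline to the DataNodes drains asynchronously, so client-visible
# latency doesn't grow with replica count or with one slow replica.
# Ceph-style: the write is acknowledged only once every replica has it,
# so the slowest OSD bounds the client-visible latency.

LOCAL_WRITE_MS = 1

def hdfs_style_ack_ms(replica_ms):
    # Ack after the local write; replication happens in the background.
    return LOCAL_WRITE_MS

def ceph_style_ack_ms(replica_ms):
    # Ack once all replicas are durable: wait for the slowest one.
    return max(replica_ms)

replicas = [10, 12, 30]  # ms for each replica to persist the write
print(hdfs_style_ack_ms(replicas))  # 1
print(ceph_style_ack_ms(replicas))  # 30
```

So in the HDFS-style model a 30ms straggler is invisible to the client, while in the synchronous model it sets the floor for every write that touches it.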
:( I've created http://tracker.newdream.net/issues/1752 to keep track of this issue; it'll be properly prioritized next week.
-Greg
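P.S. To make the reweighting suggestion above concrete, here's a toy sketch (mine, not Ceph's actual CRUSH code; the OSD names and weights are made up) of weighted pseudo-random placement. Each OSD gets a deterministic hash-based draw scaled by its weight, and the best draw wins, so halving a slow node's weight roughly halves its share of objects — and hence its share of reads and writes:

```python
import hashlib
import math

def place(obj_id, osds):
    # Exponential-race draw: each OSD draws -log(u)/weight from a
    # deterministic per-(object, osd) hash, and the smallest draw wins.
    # The winner is chosen with probability proportional to its weight.
    best_osd, best_draw = None, float("inf")
    for name, weight in osds.items():
        h = int(hashlib.md5(f"{obj_id}/{name}".encode()).hexdigest(), 16)
        u = (h + 1) / (2**128 + 1)      # uniform in (0, 1]
        draw = -math.log(u) / weight
        if draw < best_draw:
            best_osd, best_draw = name, draw
    return best_osd

osds = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 0.5}  # osd.2 is the slow disk
counts = {name: 0 for name in osds}
for i in range(10000):
    counts[place(i, osds)] += 1
# Roughly 4000 / 4000 / 2000: the half-weight node ends up holding
# about half as many objects as each full-weight node.
print(counts)
```

Again, that's just the placement intuition — the real cluster also rebalances existing data when you change a weight, which costs I/O of its own.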