On Sat, 18 Jul 2020 at 02:18, <DHilsbos@xxxxxxxxxxxxxx> wrote:
> Daniel;
> As I said, I don't actually KNOW most of this.

Seems correct in my view, though.

> As such, what I laid out was conceptual. Ceph would need to be
> implemented to perform these operations in parallel, or not.
> Conceptually, in those areas where operations can be parallelized,
> making them parallel would improve wall-clock performance in 80%-90%
> of cases, so making this configurable wouldn't make sense.
>
> That said, I don't know which route the developers went. All I know
> is that the client transfers each chunk to the master for its PG,
> and the master sends it on to the replicas. I suspect that replicas
> must acknowledge the chunk before the master finishes the
> synchronous operation. I suspect that all replicas are transferred
> (from the master) in parallel.

It probably is parallel, but if the traffic can max out your network,
the master will still have to wait until all replicas have received
their data. In this case it seems it was repl=2, so that is not an
issue, but if you had 1GbE and repl > 2, I'm sure you'd notice how the
network would make those transfers feel very non-parallel. ;)

> Given the maturity of Ceph, I suspect this has already been done,
> unless the developers ran into a significant issue, but I don't know.

One of the things to consider is that if you run one single stream,
you are basically not testing what the cluster can do, only what the
absolute smallest setup does. If you have hundreds of clients talking
to your cluster, they can't all just fire off a single copy, move
straight on to the next one, and leave all the background replication
traffic to happen later, times 100. Well, they can, but each client
will still see the lower "real" per-client bandwidth at that point.

Going async is the same as writing to RAM buffers, or to a small
WAL/journal on a faster drive, and so on. It helps with a short
temporary spike if nothing else was running at the same time, but it
doesn't really reflect the true capacity of the cluster for a single
client, nor will a single test show the total capacity of a Ceph
cluster, since that will be the sum of all the (lower) single-client
speeds. It will only show the performance you can get until your RAM
buffer or WAL/journal runs out of capacity, so you aren't really
benchmarking the right thing.

--
May the most significant bit of your life be positive.
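
P.S. A back-of-envelope sketch (in Python) of the fan-out arithmetic
above. The numbers are illustrative assumptions, not measurements; the
only point is that the primary OSD (the "master" in this thread)
shares one outbound link across size-1 replica copies:

  # Effective sustained client write rate when the primary OSD must
  # forward (size - 1) replica copies over a single full-duplex NIC.
  NIC_GBIT = 1.0  # assumed 1 GbE link, for illustration only

  def max_client_write_gbit(nic_gbit: float, size: int) -> float:
      # Inbound: one copy arriving from the client. Outbound:
      # (size - 1) copies to the replicas, all sharing the same link,
      # so the outbound side becomes the bottleneck once size > 2.
      return nic_gbit / max(1, size - 1)

  for size in (2, 3, 4):
      rate = max_client_write_gbit(NIC_GBIT, size)
      print(f"size={size}: ~{rate:.2f} Gbit/s sustained per client")

With size=2 the single replica copy still fits on the primary's
outbound link, which is why repl=2 hides the effect; at size=3 the
same link carries two copies and the sustained client rate halves.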
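
P.P.S. The buffer argument in equally rough numbers, with the same
caveat that everything here is made up for illustration: a fast buffer
(RAM, or a WAL/journal on a quicker device) absorbs writes at burst
rate W while draining to the backing store at sustained rate D, so any
benchmark shorter than the fill time reports W instead of D:

  # Time until a fast write buffer fills and throughput falls from
  # the burst rate to the sustained drain rate of the backing store.
  def seconds_until_buffer_full(buffer_gb: float,
                                burst_gbs: float,
                                drain_gbs: float) -> float:
      if burst_gbs <= drain_gbs:
          return float("inf")  # drain keeps up; buffer never fills
      return buffer_gb / (burst_gbs - drain_gbs)

  t = seconds_until_buffer_full(buffer_gb=4.0, burst_gbs=1.0,
                                drain_gbs=0.3)
  print(f"burst speed lasts ~{t:.0f} s")  # ~6 s with these numbers

A 5-second bench against that setup would happily report 1.0 GB/s; a
60-second one would land somewhere near 0.3 GB/s, which is the number
that actually matters under sustained load.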