On 06/09/13 20:43, Anand Avati wrote:
As I understand it, what the readdir-ahead translator does is collect one or more answers from the DHT translator and combine them to return a single answer as big as possible. If that is correct, it will certainly reduce the number of readdir calls from the application, but I think it will still have considerable latency when used on big clusters. Anyway, I don't have any measurements or solid arguments to support this, so let's see how readdir-ahead works in real environments before discussing it further.

I've seen customers with large amounts of empty, or almost empty, directories. Don't ask me why, I don't understand it either...

I ran the tests on 6 physical servers (Quad Atom D525 at 1.8 GHz; these are the only servers I can use regularly for testing) connected through a dedicated 1 Gbit switch. Bricks are stored on 1 TB SATA disks with ZFS. One of the servers was also used as the client running the tests.

Initially I created a volume with a single brick and populated it with 50 directories of 50 subdirectories each (a total of 2500 directories, no files). After each test, I added a new brick and started a rebalance. Once the rebalance was completed, I unmounted and stopped the volume and then started it again. Each test consisted of 4 runs of 'time ls -lR /<testdir> | wc -l'. The first result was discarded; the results shown below are the mean of the other 3 runs.

1 brick:  11.8 seconds
2 bricks: 19.0 seconds
3 bricks: 23.8 seconds
4 bricks: 29.8 seconds
5 bricks: 34.6 seconds
6 bricks: 41.0 seconds
12 bricks (2 bricks on each server): 78.5 seconds

The rebalancing time also grew considerably (these times come from a single rebalance each, so they might not be very accurate):

From 1 to 2 bricks:  91 seconds
From 2 to 3 bricks:  102 seconds
From 3 to 4 bricks:  119 seconds
From 4 to 5 bricks:  138 seconds
From 5 to 6 bricks:  151 seconds
From 6 to 12 bricks: 259 seconds

The number of disk IOPS didn't exceed 40 on any server in any of the tests. The network bandwidth didn't go beyond 6 Mbit/s between any pair of servers, and none of them reached 100% usage on any core.
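In case it helps to reproduce or interpret the numbers, here is roughly the procedure for one step of the test, written as a shell sketch. The volume name, hostnames and brick paths below are placeholders, not the exact ones I used, and the gluster commands are the standard CLI ones rather than a verbatim copy of my history:

  # Placeholder names: volume 'testvol', servers 'serverN', bricks under /bricks.
  # Create and mount the initial single-brick volume:
  gluster volume create testvol server1:/bricks/testvol
  gluster volume start testvol
  mount -t glusterfs server1:/testvol /mnt/testvol

  # Populate 50 directories with 50 subdirectories each (2500 directories, no files):
  for i in $(seq 1 50); do
      for j in $(seq 1 50); do
          mkdir -p /mnt/testvol/testdir/dir.$i/sub.$j
      done
  done

  # The test itself: 4 runs, the first discarded, the mean taken over the other 3:
  for run in 1 2 3 4; do
      time ls -lR /mnt/testvol/testdir | wc -l
  done

  # Before the next measurement: add a brick, rebalance, then unmount,
  # stop and start the volume so everything begins from a clean state.
  gluster volume add-brick testvol server2:/bricks/testvol
  gluster volume rebalance testvol start
  gluster volume rebalance testvol status     # wait until it reports completed
  umount /mnt/testvol
  gluster --mode=script volume stop testvol   # --mode=script skips the y/n prompt
  gluster volume start testvol
  mount -t glusterfs server1:/testvol /mnt/testvol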
Xavi