Hi folks,

A few weeks ago I started a PHP project to build a highly scalable distributed file system similar to MogileFS. I did just that and called it PHPDFS. I have just completed a test of PHPDFS on 20 EC2 instances, and it performed quite well.

I set up 20 m1.large Amazon instances (5 clients, 15 servers), uploaded 250 GB of data, and downloaded 1.5 TB of data. Total overall transfer was 1.8 TB. The blog has more info, plus links to 870 graphs and the 300 MB of data that was collected and analyzed:

http://phpdfs.blogspot.com

Here are some highlights:

- 500 threads (5 Java clients, 100 threads each)
- 15 servers
- ~250 GB uploaded (PUT requests; individual files between 50 KB and 10 MB)
- ~1.5 TB downloaded (GET requests)
- ~1.8 TB transfer total
- ~47 MB/sec upload rate
- ~201 requests/sec overall
- The data was very evenly distributed across all nodes
- 40 replicas were lost and totally unrecoverable, amounting to 0.030% data loss

Basically, PHPDFS performed quite well. There was a small amount of data loss due to one of the servers getting really hot: a few uploads to the hot server were corrupted, and the corrupted objects were then replicated. Better error handling and a checksum mechanism should prevent that from happening again, so that is next on the development list.

Anyway, I just wanted to let the list know that this is coming along quite nicely and has actually gotten some attention from the folks at the Storage Systems Research Center at UC Santa Cruz:

http://www.ssrc.ucsc.edu/

If anyone has questions or wants to get involved, please let me know.

You can download PHPDFS here:

http://code.google.com/p/phpdfs/downloads/list

peace,
-Shane
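
P.S. To illustrate the checksum idea mentioned above, here is a minimal sketch in Python. It is not PHPDFS code: the function names, the use of SHA-256, and the in-memory dicts standing in for storage nodes are all assumptions made up for illustration. The idea is that the client sends a digest along with each PUT, and the receiving server refuses to store or replicate a body that does not match it.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex SHA-256 digest of the payload."""
    return hashlib.sha256(data).hexdigest()


def store_and_replicate(key, body, expected, nodes):
    """Store body on every node only if it matches the client's checksum.

    nodes is a list of dicts standing in for storage servers (hypothetical).
    Returns True if stored, False if the upload was rejected as corrupt.
    """
    if checksum(body) != expected:
        return False  # corrupted in transit: do not store, do not replicate
    for node in nodes:
        node[key] = body
    return True


if __name__ == "__main__":
    payload = b"example file contents"
    digest = checksum(payload)  # computed by the client before the PUT
    nodes = [{}, {}, {}]
    print(store_and_replicate("a.txt", payload, digest, nodes))
    print(store_and_replicate("b.txt", payload[:-1], digest, nodes))
```

With a check like this, a rejected upload can simply be retried by the client, so a corrupted object never reaches the replicas in the first place.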