> To follow up with Ted: nobody said using the filesystem is bad. No, it is the most efficient way.
Not in my environment. All db servers have RAID 10 over 8 SCSI 15K disks. Pulling from them is always faster than a webserver pulling from its SATA drive. There is also the issue of going to the database versus going to the filesystem. Timing wise, you
Add an image server (or 20) and change the HTML to point to the image server.
I can't imagine Flickr doing that.
I believe that the only "new" thing I have to add is for newbies. I believe that for a newbie, it would be easier to use the filesystem rather than the DB. True, you then have to do some extra cleanup/management work for deleted records, so that the related images go away. But storing them in the DB invariably ends up with too many issues involving DB storage size and query buffer size, compounded by data escaping/security issues.
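The "extra cleanup/management work" mentioned above can be sketched briefly. This is a minimal illustration (all names are invented, using Python and SQLite for brevity rather than the thread's PHP/MySQL): the database holds only the path, so deleting a record must also remove the file on disk.

```python
import os
import sqlite3
import tempfile

# Hypothetical setup: images live on disk, only the path goes in the DB.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT)")

storage_dir = tempfile.mkdtemp()

def save_image(image_id, data):
    path = os.path.join(storage_dir, "%d.jpg" % image_id)
    with open(path, "wb") as f:
        f.write(data)
    db.execute("INSERT INTO images (id, path) VALUES (?, ?)", (image_id, path))
    return path

def delete_image(image_id):
    # The cleanup step: the file does not go away by itself when the
    # record is deleted -- the application has to remove it.
    row = db.execute("SELECT path FROM images WHERE id = ?", (image_id,)).fetchone()
    if row:
        os.remove(row[0])
        db.execute("DELETE FROM images WHERE id = ?", (image_id,))

path = save_image(1, b"\xff\xd8fake-jpeg-bytes")
delete_image(1)
print(os.path.exists(path))  # False: file removed along with the record
```

Forget that `os.remove` and the orphaned files pile up, which is exactly the management burden being traded against the DB-size and buffer issues.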
Strange... I came to the opposite conclusion. Using prepared statements eliminates data escaping issues, etc. And putting the files in the db removes the extra cleanup/management stuff. And it is easier to back up (though not efficient).
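A quick sketch of that opposite approach (again in Python/SQLite for the sake of a runnable example; the thread's context would be PHP with PDO or mysqli): the bytes go into the database itself, and because a parameterized (prepared) statement never splices the binary data into the SQL string, there is nothing to escape.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, data BLOB)")

# Bytes containing quotes and NULs, which would break naive string escaping.
raw = b"\xff\xd8\xff\xe0 not-really-a-jpeg \x00\x27\x22"

# The placeholder keeps data and SQL separate: no escaping issues.
db.execute("INSERT INTO images (id, data) VALUES (?, ?)", (1, raw))

# Deleting the row deletes the image: no separate filesystem cleanup.
fetched = db.execute("SELECT data FROM images WHERE id = 1").fetchone()[0]
print(fetched == raw)  # True: the blob round-trips byte-for-byte
```

The backup point follows from this too: one database dump captures everything, at the cost of a much larger (and slower) dump.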
Again, the problem of replication or distribution does not require a database. If you are saying that your single database will contain all your bitmap files, then that's messed up and your database will be a bottleneck. You've stated a problem: a large amount of data spread across multiple machines. That is a real problem domain, but it absolutely does not say why a database is the right solution, or even a solution at all.
I guess you skimmed what I wrote. What I wrote was about using a database for meta data: server and file location. I was talking about using that data to intelligently know where the file was .... on a file system. Mostly because it is cheaper to scale that way, as you can tune things to add more replicas for redundancy and performance.

That can hit scalability problems with many hundreds of servers. But those are easy to solve by breaking down the meta data and storage parts. Have a few storage servers talk to each meta data (database) server, such that you can run the databases on the same cheap machines that you run the storage stuff on. Then you have a task server that manages the meta data servers and storage servers.

Chaining things this way may seem like a lot of steps to go through. But it can be very efficient throughput-wise, which matters far more than a benchmark. To anyone who has designed CPUs, this will look a little familiar (though more flexible). At that point you also never have a complete backup on one machine. I remember that being a weird thing....

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
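The metadata tier described above can be sketched in a few lines. This is a hedged illustration under invented names (`replicas`, `store-01`, etc., none from the thread): a small metadata database maps each file to the storage servers holding replicas, and clients ask the metadata tier first, then fetch from the cheap storage boxes.

```python
import sqlite3

# Metadata database: which storage servers hold a replica of which file.
meta = sqlite3.connect(":memory:")
meta.execute("CREATE TABLE replicas (file_id TEXT, server TEXT, path TEXT)")

def add_replica(file_id, server, path):
    meta.execute("INSERT INTO replicas VALUES (?, ?, ?)", (file_id, server, path))

def locate(file_id):
    # Returns every (server, path) replica; the caller can pick the
    # least-loaded one, which is how replicas are tuned for
    # redundancy and read performance.
    return meta.execute(
        "SELECT server, path FROM replicas WHERE file_id = ?", (file_id,)
    ).fetchall()

# Two replicas of the same file on different cheap storage machines.
add_replica("img42", "store-01", "/data/a/img42.jpg")
add_replica("img42", "store-07", "/data/c/img42.jpg")
print(locate("img42"))
```

The database here holds only a few rows per file, so it stays small and fast while the bulk data lives on the filesystem of the storage servers, which is the whole point of the split.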