Mike McGrath wrote:
> Can anyone give a brief architectural overview? (having not looked at it
> at all I was wondering stuff like:)

For the storage part (Swift) the authoritative source is http://swift.openstack.org/overview_architecture.html or the various files in the docs/source subdir of the tarball, but I'll try to provide a brief overview.

The storage model is very similar to Amazon's S3: an HTTP interface, a single-level hierarchy of containers (S3 buckets) and objects, mostly whole-object get/put, some support for attributes and ACLs on objects, etc. The main difference I've found is that Swift requires the API user to obtain a token from a (semi-)separate auth service and then present that same token on all subsequent requests, while S3 does its own per-request auth.

Internally, the whole thing is based on consistent hashing, but it's consistent hashing onto partitions rather than hashing directly from item to server. IOW, an item is hashed to a partition, and the assignment of partitions to (N-way-replicated) servers is done offline via a "ring-builder" utility. There are separate rings and partition/server sets for objects, containers, and accounts, with each server using its own internal sqlite3 database to store metadata and plain files for data.

There are also one or more HTTP proxy servers, based on WebOb and eventlet, which provide the API service. An incoming object is hashed to the appropriate partition, the replica servers are looked up in the global partition/server map, and the object contents are then streamed directly (I confirmed this with the devs) to the N object servers that will hold copies. There are also some background processes to do re-replication, auditing, etc.

It's all Python, and it comes with a semi-decent set of unit and functional tests which could also be used for other similar data stores.
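To make the partition idea concrete, here's a minimal Python sketch of hashing an item to a partition and then looking up its replicas. It assumes a Swift-like scheme (MD5 of the object path, top bits of the digest selecting one of 2**partition_power partitions); the names PARTITION_POWER, SERVERS, get_partition, and get_nodes are illustrative, not Swift's actual API, and the round-robin replica map stands in for the real ring-builder output.

```python
import hashlib

PARTITION_POWER = 16  # 2**16 = 65536 partitions

def get_partition(account, container, obj):
    """Hash an object path to a partition number (not to a server)."""
    path = f"/{account}/{container}/{obj}".encode()
    digest = hashlib.md5(path).digest()
    # Take the top 32 bits of the digest, keep only PARTITION_POWER of them.
    return int.from_bytes(digest[:4], "big") >> (32 - PARTITION_POWER)

# In Swift the partition -> servers assignment is built offline by the
# ring-builder; this toy map just assigns each partition to N servers
# round-robin so the lookup step can be shown.
N_REPLICAS = 3
SERVERS = ["obj1", "obj2", "obj3", "obj4", "obj5"]

def get_nodes(partition):
    """Look up the N replica servers holding copies of this partition."""
    return [SERVERS[(partition + i) % len(SERVERS)] for i in range(N_REPLICAS)]
```

The point of the extra indirection is that moving a partition between servers only changes the (offline-built) partition/server map, not the hash function, so item-to-partition assignments never move.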
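The token flow mentioned above can be sketched roughly as follows: one request to the auth service returns a token, and every later storage request carries that same token in a header. The header names here (X-Auth-User, X-Auth-Key, X-Auth-Token) follow Swift's simple "v1" auth convention as I understand it; the URLs and helper names are made up for illustration, and the functions just build the requests rather than send them.

```python
import urllib.request

def auth_request(auth_url, user, key):
    """Build the one-time auth request; the response would carry
    an X-Auth-Token header to reuse on every subsequent request."""
    return urllib.request.Request(auth_url, headers={
        "X-Auth-User": user,
        "X-Auth-Key": key,
    })

def put_request(storage_url, token, container, name, data):
    """Build an object PUT that presents the previously obtained token."""
    return urllib.request.Request(
        f"{storage_url}/{container}/{name}",
        data=data, method="PUT",
        headers={"X-Auth-Token": token},
    )
```

Contrast this with S3, where each request is individually signed with the account's secret key instead of carrying a session token.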
_______________________________________________
cloud mailing list
cloud@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/cloud