On Mon, Apr 4, 2011 at 4:33 AM, Rutger ter Borg <rutger@xxxxxxxxxxx> wrote:
>
> Hello,
>
> I'm in the process of evaluating librados as an object store. I'm using
> Debian's latest packages as of today, and noted that I need to define
> NO_ATOMIC_OPS to get something compiled that includes librados.hpp.

Hmmm, that's not right! Can you elaborate? It just doesn't build unless you define NO_ATOMIC_OPS when including librados.hpp? Is there a build error at some point?

> My actual question: is there a requirement/optimum on the relation between
> the number of objects and the number of pools? Or may this be chosen
> completely arbitrarily?

There's no optimal relationship at all between the number of objects and the number of pools. Pools are logical groupings that have very little impact on performance. Placement groups matter rather more to performance, but again the number of objects per PG isn't terribly significant as long as #objects >> #PGs (for optimal performance you want ~100 PGs/OSD, but this number can vary significantly before it becomes a problem).

> I.e., is it a problem to have a couple of hundred
> IoCtxs active in one process?

Not at all!

> What is a reasonable/performance-wise-good object size?

Hmm, that depends on your specific hardware configuration. Ceph itself uses 4MB objects by default; in general, object size can vary a great deal without issue. There may still be some issues with very large objects (where object size is of the same order of magnitude as the OSD's RAM), but I think these have all been dealt with.

In general, the thing to be aware of is that objects are the atomic unit as far as librados and CRUSH are concerned: placement within the cluster is pseudo-random by object, so if the size of your objects varies dramatically, disk utilization can vary a lot without RADOS becoming aware of it. (There are mechanisms to deal with overloaded OSDs, so it's not a dealbreaker, just something to be aware of.)
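The ~100 PGs/OSD rule of thumb can be sketched as a quick calculation. Note this is a hypothetical helper, not part of librados; the round-up-to-a-power-of-two step is a common convention and is an assumption here, not something stated above:

```python
# Sketch of the "~100 PGs per OSD" rule of thumb.
# pg_target() is a hypothetical helper (not a librados call); rounding
# up to a power of two is a common convention, assumed for illustration.

def pg_target(num_osds, pgs_per_osd=100):
    """Suggest a total PG count near pgs_per_osd * num_osds,
    rounded up to the next power of two."""
    raw = num_osds * pgs_per_osd
    power = 1
    while power < raw:
        power *= 2
    return power

print(pg_target(12))  # 12 OSDs -> 1200 raw -> 2048
```

As the text says, the exact number can vary significantly before it becomes a problem; this just gives a starting point in the right ballpark.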
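The point about pseudo-random per-object placement can be illustrated with a toy simulation. This is emphatically not CRUSH — just a hash-based stand-in (the `place` and `utilization` helpers are invented for this sketch) to show why uniform object sizes even out per-OSD load while widely varying sizes skew disk utilization in ways the placement function cannot see:

```python
# Toy illustration of pseudo-random placement by object name.
# NOT the real CRUSH algorithm -- place() and utilization() are
# hypothetical helpers for this sketch only.

import hashlib

def place(obj_name, num_osds):
    """Deterministic pseudo-random placement keyed on the object name."""
    digest = hashlib.md5(obj_name.encode()).hexdigest()
    return int(digest, 16) % num_osds

def utilization(sizes, num_osds=4):
    """Total bytes landing on each OSD for a {name: size} mapping."""
    used = [0] * num_osds
    for name, size in sizes.items():
        used[place(name, num_osds)] += size
    return used

# 1000 uniform 4 MB objects spread fairly evenly across OSDs;
# replace the sizes with a highly skewed distribution and the
# per-OSD totals diverge even though placement is unchanged.
uniform = {"obj%d" % i: 4 << 20 for i in range(1000)}
print(utilization(uniform))
```

The placement only ever sees object names, never sizes, which is exactly why dramatic size variance shows up as uneven disk utilization.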
Also, recovery and rebalancing, while logically handled per placement group, largely operate object by object -- so if you have very large objects you may see strange behavior during recovery, since the throttling is designed for objects in the several-MB range.

-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html