On Wed, 16 Oct 2013, Amit Tiwary wrote:
> We are using ceph version 0.56.6, librados C++ APIs and have more than 750
> million objects in a single pool. Objects are named as
> "domain-name_file-name".
>
> We are unable to ascertain in what order objects are listed with the command
> "rados -p poolname ls". They are ordered neither by object name, nor by size
> or mtime.
>
> Q1) Is there any way we can control the way objects are scanned/listed in a
> pool with the below librados C++ code? We are interested in getting a list of
> objects sorted or grouped by object name.
> librados::ObjectIterator it = ioctx.objects_begin();
> for (; it != ioctx.objects_end(); ++it)
> ...

They are ordered by hash(object name).

> Q2) In the near future, if we upgrade and make use of namespaces (i.e. make
> domain-name the namespace and store all objects of a particular domain in
> that namespace), would scanning the objects in a namespace be more efficient
> than the current scenario, where we have to scan the entire pool to fetch
> all objects?

Namespaces do not improve object listing efficiency; listing is still
O(size of the pool).

> Q3) Do you have any other recommendations off the top of your head that can
> improve the time required to scan all objects of a pool/namespace?

If you need efficient queries by name prefix (or whatever else), you need to
maintain some sort of separate index. Radosgw does this for the S3 interface
by using key/value objects for each bucket. The kvstore class implements a
btree on top of such objects to provide improved scalability.

Hope that helps!
sage
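
To make the "separate index" suggestion concrete, here is a minimal sketch of the idea only. It is not how radosgw or the kvstore class is implemented, and it uses no librados calls. The names NameIndex, add, remove and list_prefix are hypothetical, and the in-memory std::set stands in for whatever persistent, sorted key/value storage the index would really live in. The point is that once names are kept in sorted order, a prefix query such as "all objects of one domain" costs O(log n + matches) rather than a scan of the whole pool.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a persistent name index. A real deployment would
// have to store the sorted keys durably (for example as key/value data on
// dedicated index objects) and keep them in sync with object writes/deletes;
// none of that is shown here.
class NameIndex {
public:
    // Record an object name when the object is written.
    void add(const std::string& name) { names_.insert(name); }

    // Drop the name when the object is deleted.
    void remove(const std::string& name) { names_.erase(name); }

    // Return every indexed name that starts with `prefix`, in sorted order.
    // lower_bound() jumps to the first candidate; all names sharing the
    // prefix are contiguous from there, so we stop at the first mismatch.
    std::vector<std::string> list_prefix(const std::string& prefix) const {
        std::vector<std::string> out;
        for (auto it = names_.lower_bound(prefix);
             it != names_.end() &&
             it->compare(0, prefix.size(), prefix) == 0;
             ++it) {
            out.push_back(*it);
        }
        return out;
    }

private:
    std::set<std::string> names_;  // kept sorted by name
};
```

With the "domain-name_file-name" naming scheme from the question, calling list_prefix("example.com_") on such an index would return just that domain's object names, sorted by name, without touching the rest of the index.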