On 17 October 2013 12:25, Amit Tiwary <tiwaryamt@xxxxxxxxx> wrote:
> Sage Weil <sage <at> inktank.com> writes:
>
>>
>> On Wed, 16 Oct 2013, Amit Tiwary wrote:
>> > We are using ceph version 0.56.6, librados C++ APIs and have more than
>> > 750 million objects in a single pool. Objects are named as
>> > "domain-name_file-name".
>> >
>> > We are unable to ascertain in what order objects are listed with the
>> > command "rados -p poolname ls". They are ordered neither by object
>> > name, nor by size or mtime.
>> >
>> > Q1) Is there any way we can control the way objects are scanned/listed
>> > in a pool with the below librados C++ code? We are interested in
>> > getting a list of objects sorted or grouped by object name.
>> >     librados::ObjectIterator it = ioctx.objects_begin();
>> >     for (; it != ioctx.objects_end(); ++it)
>> >     ...
>>
>> They are ordered by hash(object name).
>>
>> > Q2) In the near future, if we upgrade and make use of namespaces (i.e.
>> > make domain-name the namespace and store all objects of a particular
>> > domain in that namespace), would scanning the objects in a namespace
>> > be more efficient than the current scenario, where we have to scan the
>> > entire pool to fetch all objects?
>>
>> The namespaces do not improve object listing efficiency; it is still
>> O(size of the pool).
>>
>> > Q3) Do you have any other recommendations, off the top of your head,
>> > that can improve the time required to scan all objects of a
>> > pool/namespace?
>>
>> If you need efficient queries by name prefix (or whatever else) you need
>> to maintain some sort of separate index. Radosgw does this for the S3
>> interface by using key/value objects for each bucket. The kvstore class
>> implements a btree on top of such objects to provide improved scalability.
>>
>> Hope that helps!
>> sage
>
>
> Thanks Yuan and Sage. I feel namespace level stats would definitely be a
> good thing to have in future releases.
>
> I have lately been following threads regarding the optimal number of
> pools, and I understand that increasing the number of pools increases
> the memory footprint and can become a bottleneck.
> Does the number of objects or the amount of data in each pool (i.e. if
> the number of objects or amount of data is highly skewed across pools)
> have any impact on the performance of Ceph?

In my opinion, there is no impact. With the default CRUSH map, each pool
is distributed over all OSDs.

>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Dong Yuan
Email:yuandong1222@xxxxxxxxx
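
For reference, Sage's suggestion above (a separate index for name-prefix
queries) might look roughly like the sketch below. This is only an
illustration under stated assumptions: NameIndex, add, remove and
list_prefix are hypothetical names, and the index lives in an in-memory
std::map rather than in RADOS key/value objects as radosgw does. The point
is only that a sorted key space makes "all objects of one domain" a range
lookup instead of a scan of the whole pool.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a name index. In a real deployment the
// sorted keys would be kept in RADOS key/value objects, not in a std::map.
class NameIndex {
public:
    // Record an object name (e.g. "example.com_a.txt") and its size in bytes.
    void add(const std::string& name, unsigned long long size) {
        assert(!name.empty());
        entries_[name] = size;
    }

    // Drop a name from the index; a no-op if the name is not present.
    void remove(const std::string& name) { entries_.erase(name); }

    // Return every indexed name that starts with `prefix`, in sorted order.
    // Cost is O(log n + matches) instead of a scan over all entries.
    std::vector<std::string> list_prefix(const std::string& prefix) const {
        std::vector<std::string> out;
        // lower_bound finds the first key >= prefix; keys sharing the prefix
        // are contiguous from there, so stop at the first mismatch.
        for (auto it = entries_.lower_bound(prefix);
             it != entries_.end() &&
             it->first.compare(0, prefix.size(), prefix) == 0;
             ++it) {
            out.push_back(it->first);
        }
        return out;
    }

private:
    std::map<std::string, unsigned long long> entries_;
};
```

With names of the form "domain-name_file-name", calling
list_prefix("example.com_") would return only that domain's entries. The
index has to be updated on every object write and delete, which is the
price paid for the faster listing.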