On 21 September 2016 at 17:28, Haomai Wang <haomai@xxxxxxxx> wrote:
> BTW, why you need to iterate so much objects..... I think it should be
> done by other ways to achieve the goal.
>

Mostly it's just a brute-force way to identify objects that shouldn't
exist, or objects that have been orphaned (e.g. last modification time
was over 60 days ago).

This housekeeping probably wouldn't be needed if it were possible to
rely on the storage platform, and on the index that holds a reference to
every stored object, always being correct. In reality, strange things
happen: data was never written, or goes missing during or after a
migration, or a disk fails, etc...

When the lifecycle of an object ends, it gets removed from the index and
deleted from disk. Again, reality: data was never deleted, or gets
recreated during a migration, etc... :-)

This goes back to iterating over all objects and validating that there's
nothing unexpected still on disk.

Now that I have (mostly) one region migrated over to Ceph, maybe there
will start to be less reliance on this sort of housekeeping. But the
constant stat'ing of objects to check that they exist must still happen
during periodic refreshes.

From what I gather from my local tests, and from feedback on here, it
seems like there should be ample room for improvement in object
iteration. If I request an object via rados_nobjects_list_next(), the
chances of me asking for the next object via the same call should be
pretty high, right? And it would do no harm to prefetch that data before
the rados client requests it.

--
Iain Buclaw

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
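
To make the read-ahead idea above concrete, here is a minimal client-side
sketch in Python. It is not how librados behaves and it changes nothing
on the OSD side. It only assumes that the listing is available as an
ordinary iterator of object names. The names prefetching() and depth are
made up for this illustration. A background thread keeps pulling the next
entries while the caller is still busy with the current one.

```python
import queue
import threading


def prefetching(source, depth=64):
    """Yield items from `source`, reading up to `depth` items ahead.

    `source` is any iterator (assumed here to be an object listing).
    Hypothetical helper for illustration; not part of any Ceph API.
    """
    buf = queue.Queue(maxsize=depth)  # bounded, so read-ahead cannot run away

    def producer():
        try:
            for item in source:
                buf.put(("item", item))  # blocks once `depth` items are queued
            buf.put(("done", None))
        except Exception as exc:
            # Hand listing errors over to the consumer instead of losing them.
            buf.put(("error", exc))

    # Generator body runs lazily, so the thread starts on the first next().
    # It is a daemon thread: if the consumer stops early, it will not keep
    # the process alive.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, value = buf.get()
        if kind == "done":
            return
        if kind == "error":
            raise value
        yield value


if __name__ == "__main__":
    # Stand-in for a real listing: a generator of made-up object names.
    names = (f"obj-{i}" for i in range(5))
    for name in prefetching(names, depth=2):
        print(name)
```

A wrapper like this only helps when the per-object work (the stat, the
index lookup) takes long enough to overlap with the next listing call.
Whether the underlying listing handle can safely be driven from a second
thread is also an assumption that would need checking.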