short pages when listing RADOSGW buckets via Swift API

Hi,

While using rclone to migrate some data from a Swift cluster into
a RADOSGW cluster, I noticed that when listing a bucket, RADOSGW
sometimes returns fewer results than specified by the "limit"
parameter, even when more objects remain to list.

This results in rclone believing on subsequent runs that the
objects do not exist, since it performs an initial comparison
based on bucket listings, and so it needlessly recopies data.

This seems contrary to how pagination is specified by Swift:

https://docs.openstack.org/swift/latest/api/pagination.html
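
For reference, here is a minimal Python sketch of marker-based pagination as I read that document: request up to "limit" names after "marker", and treat a page shorter than "limit" as the end of the listing. The names `make_list_page`, `list_page` and `list_until_short_page` are hypothetical; `list_page` stands in for a GET on the container and is simulated over an in-memory list, with an optional short first page to imitate the behaviour described above.

```python
from typing import Callable, List


def make_list_page(names: List[str], short_first_page: int = 0) -> Callable[[str, int], List[str]]:
    """Build a hypothetical list_page(marker, limit) over sorted names.

    If short_first_page > 0, the first page (empty marker) is truncated by
    that many entries to imitate a server returning a short page.
    """
    ordered = sorted(names)

    def list_page(marker: str, limit: int) -> List[str]:
        # Names strictly greater than the marker, capped at limit.
        page = [n for n in ordered if n > marker][:limit]
        if marker == "" and short_first_page:
            page = page[:-short_first_page]
        return page

    return list_page


def list_until_short_page(list_page: Callable[[str, int], List[str]], limit: int) -> List[str]:
    """Paginate, stopping at the first page with fewer than limit entries."""
    result: List[str] = []
    marker = ""
    while True:
        page = list_page(marker, limit)
        result.extend(page)
        if len(page) < limit:
            return result
        marker = page[-1]
```

Against a simulated server that always fills its pages, this returns every name. If the first page comes back even one entry short, the loop stops there and the rest of the bucket is never seen.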

Is this known behaviour, or should I go ahead and file a bug?

I believe the cluster is running 15.2.8 or so, but will confirm.

Thanks,
Paul

---

Further observations:

 * Here's a summary of the reply lengths I got when listing
   various buckets in our RADOSGW cluster.  (This is not all of
   the buckets in the tenant; the other 100 or so are fine.)

reply lengths: 1000 999 1000 1000 1000 1000 1000 1000 1000 1000 119
reply lengths: 1000 992 1000 1000 1000 1000 1000 935 1000 1000 257
reply lengths: 1000 1000 1000 1000 1000 975 1000 948
reply lengths: 953 1000 1000 1000 1000 1000 954 1000 1000 70
reply lengths: 1000 1000 1000 1000 998 15
reply lengths: 1000 1000 1000 1000 974 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 939 1000 1000 1000 1000 949 1000 1000 1000 644
reply lengths: 1000 1000 1000 1000 999 1000 1000 937 1000 1000 538
reply lengths: 1000 998 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 551
reply lengths: 1000 1000 1000 1000 1000 1000 1000 931 1000 986 1000 1000 1000 975 1000 989 1000 1000 1000 966 1000 998 921 994 1000 1000 973 58
reply lengths: 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 976 1000 366
reply lengths: 1000 1000 1000 1000 1000 983 1000 1000 1000 1000 1000 1000 1000 517
reply lengths: 1000 1000 1000 984 1000 1000 971 1000 1000 401
reply lengths: 949 1000 1000 1000 1000 1000 1000 403
reply lengths: 1000 998 532
reply lengths: 951 1000 1000 1000 1000 1000 976 1000 877

 * rclone uses a default $limit of 1,000, in contrast to the
   Python swiftclient's default of 10,000.

 * The Swift API doc seems clear that $limit results should always
   be returned if at least $limit results are available, and that
   receiving fewer than $limit results indicates no more exist.

   (It doesn't *explicitly* say the last, but the document could
   be a lot shorter if it were not intended for that to follow.)

 * When swiftclient is asked to fetch a listing with full_listing
   set to True, it does not implement pagination as described in
   the document above.  Instead, it simply keeps fetching pages
   until it receives an empty page.

   So Swift API implementations that don't strictly implement
   paging per the docs may not even be noticed by most users.

 * From a review of its code, swiftclient seems to have done this
   since the very beginning.  Perhaps the code was written first
   and then pagination on the server side was nailed down later?
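
The empty-page strategy described in the swiftclient bullet can be sketched in Python as below. This is not swiftclient's actual code, only an illustration of the idea; `list_until_empty_page` and `fake_short_server` are hypothetical names, and the fake server deliberately returns one entry too few on every page that would otherwise be full.

```python
from typing import Callable, Iterator, List


def list_until_empty_page(list_page: Callable[[str, int], List[str]], limit: int) -> Iterator[str]:
    """Yield every name, stopping only when a page comes back empty."""
    marker = ""
    while True:
        page = list_page(marker, limit)
        if not page:
            return
        yield from page
        # The last name seen becomes the marker for the next request.
        marker = page[-1]


def fake_short_server(marker: str, limit: int) -> List[str]:
    """Hypothetical server holding 25 objects that shortens every full page by one."""
    names = [f"obj{i:04d}" for i in range(25)]
    page = [n for n in names if n > marker][:limit]
    return page[:-1] if len(page) == limit and limit > 1 else page


all_names = list(list_until_empty_page(fake_short_server, 10))
```

This costs one extra request per listing, but it tolerates short pages because the page length is never used as the stop condition.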

-- 
Paul Collins
Wellington, New Zealand