Re: [PATCH 00/20] rbd: support for object-map and fast-diff

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





re-reading 3), i realize the queue depth will not be exceeded in the request queue.


On 30/06/2019 15:55, Maged Mokhtar wrote:


Hi Ilya,

Nice work. Some comments/questions:

1) Generally having a state machine makes things easier to track as the code is less dispersed than before.

2) The running_list is used to keep track of inflight requests in case of exclusive lock to support rbd_quiesce_lock() when releasing the lock. It would be great to generalize this list to keep track of all inflight requests even in the case when lock is not required, it could be used to support a generic flush (similar to librbd work queue flush/drain). Having a generic inflight requests list will also make it easier to support other functions such as task aborts, request timeouts, flushing on pre-snapshots.

3) For the acquiring_list, my concern is that while the lock is pending to be acquired, the requests are being accepted without a limit. In case there is a delay acquiring the lock, for example if the primary of the object header is down (which could block for ~ 25 sec) or worse if the pool is inactive, the count could well exceed the max queue depth + for write requests this can consume a lot of memory.

4) In rbd_img_exclusive_lock() at end, we queue an acquire lock task for every request. I understand this is a single threaded queue and if lock is acquired then all acquire tasks are cancelled, however i feel the queue could fill a lot. Any chance we can only schedule 1 acquire task ?

5) The state RBD_IMG_EXCLUSIVE_LOCK is used for both cases when image does not require exclusive lock + when lock is acquired. Cosmetically it may be better to separate them.

6) Probably not an issue, but we are now creating a large number of mutexes, at least 2 for every request. Maybe we need to test in high iops/queue depths that there is no overhead for this.

7) Is there any consideration to split the rbd module to multiple files ? Looking at how big librbd, fitting this in a single kernel file is challenging at best.

/Maged

On 25/06/2019 16:40, Ilya Dryomov wrote:
Hello,

This series adds support for object-map and fast-diff image features.
Patches 1 - 11 prepare object and image request state machines; patches
12 - 14 fix most of the shortcomings in our exclusive lock code, making
it suitable for guarding the object map; patches 15 - 18 take care of
the prerequisites and finally patches 19 - 20 implement object-map and
fast-diff.

Thanks,

                 Ilya


Ilya Dryomov (20):
   rbd: get rid of obj_req->xferred, obj_req->result and img_req->xferred
   rbd: replace obj_req->tried_parent with obj_req->read_state
   rbd: get rid of RBD_OBJ_WRITE_{FLAT,GUARD}
   rbd: move OSD request submission into object request state machines
   rbd: introduce image request state machine
   rbd: introduce obj_req->osd_reqs list
   rbd: factor out rbd_osd_setup_copyup()
   rbd: factor out __rbd_osd_setup_discard_ops()
   rbd: move OSD request allocation into object request state machines
   rbd: rename rbd_obj_setup_*() to rbd_obj_init_*()
   rbd: introduce copyup state machine
   rbd: lock should be quiesced on reacquire
   rbd: quiescing lock should wait for image requests
   rbd: new exclusive lock wait/wake code
   libceph: bump CEPH_MSG_MAX_DATA_LEN (again)
   libceph: change ceph_osdc_call() to take page vector for response
   libceph: export osd_req_op_data() macro
   rbd: call rbd_dev_mapping_set() from rbd_dev_image_probe()
   rbd: support for object-map and fast-diff
   rbd: setallochint only if object doesn't exist

  drivers/block/rbd.c                  | 2433 ++++++++++++++++++--------
  drivers/block/rbd_types.h            |   10 +
  include/linux/ceph/cls_lock_client.h |    3 +
  include/linux/ceph/libceph.h         |    6 +-
  include/linux/ceph/osd_client.h      |   10 +-
  include/linux/ceph/striper.h         |    2 +
  net/ceph/cls_lock_client.c           |   47 +-
  net/ceph/osd_client.c                |   18 +-
  net/ceph/striper.c                   |   17 +
  9 files changed, 1817 insertions(+), 729 deletions(-)



--
Maged Mokhtar
CEO PetaSAN
4 Emad El Deen Kamel
Cairo 11371, Egypt
www.petasan.org
+201006979931
skype: maged.mokhtar



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux