Hi Ilya, Jason and all:

This is V2 of the rbd journaling series.

Testing:

It passes the krbd suite in my teuthology testing: 47 passed, 0 failed.
http://218.94.118.90:8082/teuthworker-2019-03-16_14:49:51-krbd-krbd_mirror_qa-distro-basic-plana/

kernel branch: https://github.com/yangdongsheng/linux/tree/journaling_rebase
ceph branch: https://github.com/yangdongsheng/ceph/tree/krbd_mirror_qa
test suite: teuthology-suite -v -s krbd -c krbd_mirror_qa -m plana -S d557744d538fe02e167e0513a7dd261a11c48d88 --filter-out "rbd_workunit_suites_ffsb.yaml,rbd_workunit_suites_iozone.yaml,rbd_xfstests.yaml,rbd_simple_big,pre-single-major,rbd_concurrent,rbd_kfsx"

(1): I filtered out some cases, such as rbd_simple_big, because they
     fail for reasons unrelated to krbd journaling.
(2): I filtered out rbd_concurrent because of
     http://tracker.ceph.com/issues/38553. This is a problem in the
     rbd remove command.
(3): A new test is added: workunits/rbd/kernel_journal.sh. It tests
     journal replaying in krbd and will be improved when we support
     more event types, such as snapshot.
(4): A new test is added: qa/suites/krbd/mirror/. It tests krbd
     journaling with the rbd-mirror daemon.

Performance:

Compared with librbd journaling, the performance of krbd journaling looks
reasonable: enabling journaling costs about 60% of IOPS in rbd bench
(204 -> 81) and about 56% in fio on krbd (210 -> 92).

-------------------------------------------------------------
(1) rbd bench with journaling disabled:     |  IOPS: 204
-------------------------------------------------------------
(2) rbd bench with journaling enabled:      |  IOPS: 81
-------------------------------------------------------------
(3) fio krbd with journaling disabled:      |  IOPS: 210
-------------------------------------------------------------
(4) fio krbd with journaling enabled:       |  IOPS: 92
-------------------------------------------------------------

Changelog:
    -V1
        1. add test case in qa
        2. address all memleaks found by kmemleak
        3. several bug fixes
        4. performance improvement.
    -RFC
        1. error out if there is an unsupported event type in replaying
        2. just one memory copy from bio to msg.
        3. use async IO in journal appending.
        4. no mutex around IO.

Any comments are welcome!!

Dongsheng Yang (16):
  libceph: introduce ceph_extract_encoded_string_kvmalloc
  libceph: introduce a new parameter of workqueue in ceph_osdc_watch()
  libceph: support op append
  libceph: add prefix and suffix in ceph_osd_req_op.extent
  libceph: introduce cls_journaler_client
  libceph: introduce generic journaling
  libceph: journaling: introduce api to replay uncommitted journal events
  libceph: journaling: introduce api for journal appending
  libceph: journaling: trim object set when we found there is no client refer it
  rbd: wakeup requests when we get -EBLACKLISTED in lock acquiring
  rbd: wait image request all complete in lock releasing
  rbd: introduce completion for each img_request
  rbd: introduce journal in rbd_device
  rbd: append journal first before sending img_request
  rbd: replay events in journal
  rbd: add support for feature of RBD_FEATURE_JOURNALING

 drivers/block/rbd.c                       |  688 +++++++++-
 include/linux/ceph/cls_journaler_client.h |   91 ++
 include/linux/ceph/decode.h               |   21 +-
 include/linux/ceph/journaler.h            |  180 +++
 include/linux/ceph/osd_client.h           |   19 +
 net/ceph/Makefile                         |    3 +-
 net/ceph/cls_journaler_client.c           |  556 ++++++++
 net/ceph/journaler.c                      | 1992 +++++++++++++++++++++++++++++
 net/ceph/osd_client.c                     |   61 +-
 9 files changed, 3580 insertions(+), 31 deletions(-)
 create mode 100644 include/linux/ceph/cls_journaler_client.h
 create mode 100644 include/linux/ceph/journaler.h
 create mode 100644 net/ceph/cls_journaler_client.c
 create mode 100644 net/ceph/journaler.c

-- 
1.8.3.1