Hi all,

This patchset implements the journaling feature in kernel rbd, which makes mirroring in Kubernetes possible.

This is an RFC patchset. It passes /ceph/ceph/qa/workunits/rbd/rbd_mirror.sh with one small change, shown below, which makes write_image() write through the mapped kernel rbd device with fio instead of using `rbd bench`:

```
[root@atest-guest build]# git diff /ceph/ceph/qa/workunits/rbd/rbd_mirror_helpers.sh
diff --git a/qa/workunits/rbd/rbd_mirror_helpers.sh b/qa/workunits/rbd/rbd_mirror_helpers.sh
index e019de5..9d00d3e 100755
--- a/qa/workunits/rbd/rbd_mirror_helpers.sh
+++ b/qa/workunits/rbd/rbd_mirror_helpers.sh
@@ -854,9 +854,9 @@ write_image()
 
     test -n "${size}" || size=4096
 
-    rbd --cluster ${cluster} -p ${pool} bench ${image} --io-type write \
-       --io-size ${size} --io-threads 1 --io-total $((size * count)) \
-       --io-pattern rand
+    rbd --cluster ${cluster} -p ${pool} map ${image}
+    fio --name=test --rw=randwrite --bs=${size} --runtime=60 --ioengine=libaio --iodepth=1 --numjobs=1 --filename=/dev/rbd0 --direct=1 --group_reporting --size $((size * count)) --group_reporting --eta-newline
+    rbd --cluster ${cluster} -p ${pool} unmap ${image}
 }
 
 stress_write_image()
```

That means the patchset mirrors data correctly. There are some TODOs in the comments, but most of them are about performance improvements, so I think this is a good time to ask all of you for comments.
If you want to play with it, here is a simple script that mirrors an xfs filesystem:

```
# Start two vstart clusters: "remote" and "local".
./mstart.sh remote -k -l --bluestore
./mstart.sh local -k -l --bluestore

# Recreate the rbd pool on both clusters.
rados -c ./run/local/ceph.conf rmpool rbd rbd --yes-i-really-really-mean-it
rados -c ./run/remote/ceph.conf rmpool rbd rbd --yes-i-really-really-mean-it
rados -c ./run/local/ceph.conf mkpool rbd
rados -c ./run/remote/ceph.conf mkpool rbd

# Enable per-image mirroring and make the clusters peers of each other.
rbd -c ./run/local/ceph.conf mirror pool enable rbd image
rbd -c ./run/remote/ceph.conf mirror pool enable rbd image
rbd -c ./run/local/ceph.conf mirror pool peer add rbd client.admin@remote
rbd -c ./run/remote/ceph.conf mirror pool peer add rbd client.admin@local
rbd -c ./run/remote/ceph.conf mirror pool info rbd
rbd -c ./run/local/ceph.conf mirror pool info rbd

# Create a journaled image on the local cluster and mirror it.
rbd -c ./run/local/ceph.conf create test --image-feature layering --image-feature exclusive-lock --image-feature journaling -s 100M
rbd -c ./run/local/ceph.conf mirror image enable test
rbd -c ./run/remote/ceph.conf ls

# Write data through the kernel rbd device and record its checksum.
rbd -c ./run/local/ceph.conf map test
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /mnt/rbd0
dd if=/dev/urandom of=/mnt/rbd0/data bs=128K count=1
md5sum /mnt/rbd0/data
sync
data_1=`md5sum /mnt/rbd0/data|awk '{print $1}'`
umount /mnt/rbd0
rbd unmap /dev/rbd0

# Wait until the image shows up on the remote cluster.
until rbd -c ./run/remote/ceph.conf ls |grep test; do
	sleep 1
done

# Fail over: demote the local image, promote the remote one.
rbd -c ./run/local/ceph.conf mirror image demote test
sleep 3
rbd -c ./run/remote/ceph.conf mirror image promote test

# Read the data back from the remote image and compare checksums.
rbd -c ./run/remote/ceph.conf map test
mount /dev/rbd0 /mnt/rbd0
md5sum /mnt/rbd0/data
data_2=`md5sum /mnt/rbd0/data|awk '{print $1}'`
echo data_1: $data_1
echo data_2: $data_2
# md5 digests are hex strings, so compare them as strings.
if [ "$data_1" != "$data_2" ]; then
	echo "failed"
else
	echo "pass"
fi
umount /mnt/rbd0
rbd unmap /dev/rbd0
exit
```

Any comment is welcome!
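In case it helps when adapting the script, here is a minimal sketch of the final checksum check as a reusable helper. The function name `verify_md5` is hypothetical and not part of the patchset, and it assumes bash, md5sum and awk are available. It compares the digests as strings, because md5 hex digests are not numbers.

```shell
#!/bin/bash
# Hypothetical helper, not part of the patchset: succeed only if FILE's
# md5 digest equals the EXPECTED hex digest.
verify_md5() {
	local file=$1 expected=$2 actual
	actual=$(md5sum "$file" | awk '{print $1}')
	# An empty digest means md5sum could not read the file: treat as failure.
	# Otherwise compare the hex digests as strings, never arithmetically.
	[ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

With the variables from the script above, the check after promoting and mounting the remote image could then be written as `verify_md5 /mnt/rbd0/data "$data_1" && echo pass || echo failed`.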
Dongsheng Yang (4):
  libceph: support op append
  libceph: introduce cls_journaler_client
  libceph: introduce generic journaling
  rbd: enable journaling

 drivers/block/rbd.c                       |  478 +++++++++++-
 include/linux/ceph/cls_journaler_client.h |   87 +++
 include/linux/ceph/journaler.h            |  131 ++++
 net/ceph/Makefile                         |    3 +-
 net/ceph/cls_journaler_client.c           |  501 ++++++++++++
 net/ceph/journaler.c                      | 1208 +++++++++++++++++++++++++++++
 net/ceph/osd_client.c                     |   13 +-
 7 files changed, 2409 insertions(+), 12 deletions(-)
 create mode 100644 include/linux/ceph/cls_journaler_client.h
 create mode 100644 include/linux/ceph/journaler.h
 create mode 100644 net/ceph/cls_journaler_client.c
 create mode 100644 net/ceph/journaler.c

-- 
1.8.3.1