Re: RBD snapshots cause disproportionate performance degradation

On Thu, Nov 19, 2015 at 11:13 AM, Will Bryant <will.bryant@xxxxxxxxx> wrote:
> Hi Haomai,
>
> Thanks for that suggestion.  To test it out, I have:
>
> 1. upgraded to 3.19 kernel
> 2. added filestore_fiemap = true to my ceph.conf in the [osd] section (snippet below)
> 3. wiped and rebuilt the ceph cluster
> 4. recreated the RBD volume
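>
> For reference, the [osd] section of my ceph.conf now looks like this:
>
>     [osd]
>     filestore_fiemap = true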
>
> But I am still only getting around 120 IOPS after a snapshot.  The logs say
> the fiemap feature should be working:
>
> 2015-11-19 12:08:13.864199 7f15c937e900  0
> genericfilestorebackend(/var/lib/ceph/osd/powershop/ceph_test-osd-nwtn4_20151118-230958.997577000)
> detect_features: FIEMAP ioctl is supported and appears to work
> 2015-11-19 12:08:15.023310 7f42cfe72900  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP
> ioctl is supported and appears to work
>
> Is there anything else I need to do to take advantage of this option?

Hmm, what's the actual capacity usage of this volume? Fiemap helps a lot
for a normal workload volume with a sparse data distribution.

I tested the master branch with this case; it showed about 30% of the
IOPS compared with before creating the snapshot.
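
If you're not sure how much of the volume is actually allocated, a rough
way to estimate it is to sum the extents reported by rbd diff (assuming
the image is named "myvolume" in the "rbd" pool; substitute your own names):

    rbd diff rbd/myvolume | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'

If that comes out close to the full provisioned size, there are few holes
for the sparse copy to skip, so fiemap won't buy you much.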

>
> One more thing I forgot to mention earlier - if I watch dstat on the OSD
> hosts I notice that the CPU usage and disk IO both drop dramatically when I
> take a snapshot.  In other words it seems to be waiting for something, or
> dropping down to basically single-threaded:
>
> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
>  61  19  10   8   0   4|   0   262M|  63M   98M|   0     0 |  50k  206k
>  58  20   8   9   0   5|   0   284M|  64M  102M|   0     0 |  59k  182k
>  51  13  15  19   0   3|   0   274M|  30M   48M|   0     0 |  51k  143k
>  34   5  32  27   0   1|   0   267M|2595k 4376k|   0     0 |  55k   66k
>  38   2  58   2   0   0|   0    57M|1006k 1557k|   0     0 |4811    24k
>  38   2  58   2   0   0|   0    56M|1667k 2152k|   0     0 |4052    23k
>  39   2  57   2   0   0|   0    58M|1711k 2187k|   0     0 |4062    23k
>  38   2  58   3   0   0|   0    60M| 910k 1373k|   0     0 |4075    22k

Hmm, that's a really strange result to me. As far as I remember, there
should at least be a burst of bandwidth. In my tests, the disk bandwidth
stays at a high level.

What's your Ceph environment?
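
Also, to rule out the option not actually being applied, it might be worth
asking one of the running OSDs directly over the admin socket (osd.0 below
is just an example id; run this on the OSD host for each daemon):

    ceph daemon osd.0 config show | grep filestore_fiemap

It should show the option set to "true" on every OSD; the detect_features
line in your log mainly confirms that the FIEMAP ioctl itself works on that
filesystem.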

>
> Cheers,
> Will
>
>
> On 18/11/2015, at 19:12 , Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
>
> Yes, this is an expected case. Actually, if you use Hammer, you can enable
> filestore_fiemap to use sparse copy, which is especially useful for RBD
> snapshot copy. But keep in mind that fiemap is *broken* on some old
> kernels; CentOS 7 is the only distro I have verified works fine with this
> feature.
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best Regards,

Wheat
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


