Ceph RBD performance - random writes

I've been looking at using Ceph RBD as a block store for database use. As part of this I'm looking at how (particularly random) IO of smallish (4K, 8K) block sizes performs.

I've set up Ceph with a single osd and mon spread over two SSDs (Intel 520s) - a 2G journal on one and the osd data on the other (xfs filesystem). The Intels are pretty fast, and (despite being shackled by a crappy Nvidia SATA controller) fly for random IO.

However, I am not seeing that reflected in the RBD case. I have the device mounted on the local machine where the osd and mon are running (so network performance should not be a factor here).
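
(The full config is attached; for anyone skimming, a minimal single-osd/single-mon layout of this kind looks roughly like the sketch below - hostname, address and paths are placeholders here, not my actual values.)

[global]
        auth supported = none

[mon.a]
        host = cephhost
        mon addr = 192.168.0.10:6789

[osd]
        osd journal size = 2048        ; 2G journal

[osd.0]
        host = cephhost
        osd data = /path/to/osd0-data          ; xfs on one SSD
        osd journal = /path/to/osd0-journal    ; journal on the other SSD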

Here is what I did:

Create an rbd device of 10G and mount it on /mnt/vol0:

$ rbd create --size 10240 vol0
$ rbd map vol0
$ mkfs.xfs /dev/rbd0
$ mount /dev/rbd0 /mnt/vol0

Make a file:

$ dd if=/dev/zero of=/mnt/vol0/dump/file bs=4k count=300000 conv=fsync
1228800000 bytes (1.2 GB) copied, 13.4361 s, 91.5 MB/s

Performance is OK if the file size is < journal (2G):

$ dd if=/dev/zero of=/mnt/vol0/dump/file bs=4096k count=200 conv=fsync
838860800 bytes (839 MB) copied, 9.47086 s, 88.6 MB/s

Not so good if the file size is > journal:

$ dd if=/dev/zero of=/mnt/vol0/dump/file bs=4096k count=1000 conv=fsync
4194304000 bytes (4.2 GB) copied, 279.891 s, 15.0 MB/s

Random writes (see the attached writetest.c), synced with sync_file_range, are OK if the block size is big:

$ ./writetest /mnt/vol0/dump/file 4194304 0 1
random writes: 292 of: 4194304 bytes elapsed: 9.8397s io rate: 30/s (118.70 MB/s)

$ ./writetest /mnt/vol0/dump/file 1048576 0 1
random writes: 1171 of: 1048576 bytes elapsed: 10.6042s io rate: 110/s (110.43 MB/s)

$ ./writetest /mnt/vol0/dump/file 131072 0 1
random writes: 9375 of: 131072 bytes elapsed: 15.8075s io rate: 593/s (74.13 MB/s)


However, a smallish block size is suicide (it triggers the suicide assert after a while); I see 100 IOPS or less on the actual devices, all at 100% util:

$ ./writetest /mnt/vol0/dump/file 8192 0 1

I think I am running into http://tracker.newdream.net/issues/2784 here.

Note that the actual SSDs are very fast for this when accessed directly:

$ ./writetest /data1/ceph/1/file 8192 0 1
random writes: 1000000 of: 8192 bytes elapsed: 125.7907s io rate: 7950/s (62.11 MB/s)
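
For anyone who doesn't want to open the attachment: a minimal sketch of the kind of loop writetest runs - block-aligned random pwrite()s into an existing file, each one pushed out with sync_file_range(). This is a simplified stand-in rather than the attached source; the argument handling and the exact sync_file_range flags here are illustrative.

/* writetest-sketch.c - simplified illustration, not the attached writetest.c.
 * Build: gcc -O2 -o writetest-sketch writetest-sketch.c -lrt
 * Run:   ./writetest-sketch <file> <blocksize> <count>
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <file> <blocksize> <count>\n", argv[0]);
        return 1;
    }
    size_t bs = (size_t)atol(argv[2]);
    long count = atol(argv[3]);

    int fd = open(argv[1], O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    long nblocks = st.st_size / (long)bs;

    char *buf = malloc(bs);
    memset(buf, 'x', bs);
    srandom((unsigned)time(NULL));

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (long i = 0; i < count; i++) {
        /* pick a random block-aligned offset within the existing file */
        off_t off = (off_t)(random() % nblocks) * (off_t)bs;
        if (pwrite(fd, buf, bs, off) != (ssize_t)bs) {
            perror("pwrite"); return 1;
        }
        /* flush just this range and wait for it to reach the device */
        if (sync_file_range(fd, off, bs,
                SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER) < 0) {
            perror("sync_file_range"); return 1;
        }
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("random writes: %ld of: %zu bytes elapsed: %.4fs io rate: %.0f/s (%.2f MB/s)\n",
           count, bs, s, count / s, count * bs / s / (1024.0 * 1024.0));
    free(buf);
    close(fd);
    return 0;
}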


Thanks for your patience in reading this far - some actual questions now :-)

1/ Why is the appending write from dd so slow when the file size is > journal, despite reasonably capable storage devices?

2/ Is the sudden dramatic drop in random write performance a manifestation of the "small requests are slow" issue, or is it something else?


Thanks

Mark


Attachment: ceph.conf.gz
Description: GNU Zip compressed data

Attachment: writetest.c.gz
Description: GNU Zip compressed data

