Re: Intel 520/530 SSD for ceph

On Mon, Nov 18, 2013 at 02:38:42PM +0100, Stefan Priebe - Profihost AG wrote:
> Hi guys,
> 
> in the past we've used intel 520 ssds for ceph journal - this worked
> great and our experience was good.
> 
> Now they started to replace the 520 series with their new 530.
> 
> When we did, we were surprised by the poor performance, and I needed some
> days to reproduce it.
> 
> O_DIRECT works fine for both, and the Intel SSD 530 is even faster
> than the 520.
> 
> With O_DSYNC, however, see the results:
> 
> ~# dd if=randfile.gz of=/dev/sda bs=350k count=10000 oflag=direct,dsync
> 3584000000 bytes (3,6 GB) copied, 22,287 s, 161 MB/s
> 
> ~# dd if=randfile.gz of=/dev/sdb bs=350k count=10000 oflag=direct,dsync
> 3584000000 bytes (3,6 GB) copied, 136,505 s, 26,3 MB/s
> 
> I used a block size of 350k because my graphs show that this is the
> average write size we see on the journal. I also tried fio and
> bigger block sizes; the result stays the same.
> 
> Does anybody have an idea? Without dsync both devices have around the
> same performance of 260MB/s.
> 
> Greets,
> Stefan
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

You may actually be doing O_SYNC - recent kernels implement O_DSYNC,
but glibc maps O_DSYNC into O_SYNC.  But since you're writing to the
block device this won't matter much.

I believe the effect of O_DIRECT by itself is just to bypass the buffer
cache, which is not going to make much difference for your dd case.
(It will mainly affect other applications that are also using the
buffer cache...)

O_SYNC should be causing the writes to block until a response
is received from the disk.  Without O_SYNC, the writes will
just queue operations and return - potentially very fast.
Your dd is probably writing enough data that there is some
throttling by the system as it runs out of disk buffers and
has to wait for some previous data to be written to the drive,
but the delay for any individual block is not likely to matter.
With O_SYNC, you are measuring the delay for each block directly,
and you have completely removed the disk's ability to
perform any sort of parallelism.
	[It's also conceivable the kernel is sending some form of write
	barrier flag to the drive, which will slow it down further,
	but I can't find any kernel logic that does this at a quick glance.]
Sounds like the Intel 530 has a much larger block write latency,
but can make up for it by performing more overlapped operations.
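
To put rough numbers on that per-block delay, the two dd runs quoted above
can be turned into an average time per write. This is only a sketch: the
helper name per_write_ms is made up for illustration, and it assumes every
one of the 10000 writes completed as a full 350k block.

```shell
#!/bin/sh
# Average time per synchronous write, in milliseconds.
# $1 = elapsed seconds reported by dd, $2 = number of blocks (dd count=).
# per_write_ms is a hypothetical helper name, not part of any tool.
per_write_ms() {
    awk -v secs="$1" -v count="$2" \
        'BEGIN { printf "%.2f\n", secs / count * 1000 }'
}

# Figures from the two dd runs quoted above
# (decimal comma rewritten as a decimal point).
per_write_ms 22.287 10000     # /dev/sda run
per_write_ms 136.505 10000    # /dev/sdb run
```

That works out to roughly 2.2 ms per 350k write on /dev/sda against roughly
13.7 ms on /dev/sdb, which is consistent with the 161 MB/s and 26,3 MB/s
that dd reported.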

You might be able to vary this behavior by experimenting with sdparm,
smartctl or other tools, or possibly with different microcode in the drive.
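
One possible starting point, and this is my assumption rather than something
established in this thread, is to compare the volatile write cache setting
on the two drives. The sketch below only prints read-only query commands
instead of running them, since they need root and a real device. The helper
name write_cache_queries is made up, /dev/sdb is just the device name from
the dd run above, and it assumes sdparm, smartmontools (6.0 or later for
"-g wcache") and hdparm are installed.

```shell
#!/bin/sh
# Print (do not run) read-only queries for a drive's write cache setting.
# $1 = block device, e.g. /dev/sdb. Run the printed commands as root.
# write_cache_queries is a hypothetical helper name.
write_cache_queries() {
    dev="$1"
    printf '%s\n' \
        "sdparm --get=WCE $dev" \
        "smartctl -g wcache $dev" \
        "hdparm -W $dev"
}

write_cache_queries /dev/sdb
```

Whether any of these settings explains the difference between the two
drives is something only a test on the actual hardware can show.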

				-Marcus Watts