Re: Ceph Journal Disk Size

Regarding using spinning disks for journals: before I was able to put SSDs in my deployment, I came up with a somewhat novel journal setup that gave my cluster far more life than having all the journals on a single disk, or having each journal on the same disk as its OSD. I called it "interleaved journals". Essentially, offset the journal location by one disk, so in a 4-disk system:

OS disk sda has journal for sdb OSD
sdb OSD disk has journal for sdc OSD
sdc OSD disk has journal for sdd OSD
sdd OSD disk has no journal on it
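
The offset-by-one layout above can be sketched as a small mapping. This is only an illustration of the placement rule, not a deployment tool: the function name `interleave_journals` and its arguments are hypothetical, and the device names are the ones from the example.

```python
def interleave_journals(os_disk, osd_disks):
    """Map each OSD disk to the disk that holds its journal.

    Each journal sits one disk "earlier" than its OSD: the first OSD's
    journal goes on the OS disk, and the last OSD disk hosts no journal.
    """
    journal_hosts = [os_disk] + osd_disks[:-1]
    return dict(zip(osd_disks, journal_hosts))


# The 4-disk example: sda is the OS disk, sdb..sdd carry OSDs.
layout = interleave_journals("sda", ["sdb", "sdc", "sdd"])
for osd, journal_host in layout.items():
    print(f"{journal_host} has journal for {osd} OSD")
```

No OSD ends up sharing a disk with its own journal, and the last OSD disk (sdd here) carries no journal at all.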

This limited the contention substantially. When the cluster got busy enough that multiple OSDs on the same machine were writing simultaneously, it still took a hit, but it was a big upgrade from the out-of-the-box deployment. I also tried leaving the OS drive out and interleaving the journals only on the OSD drives, but that was slightly worse under load than this configuration. It seems that the contention between the journals and the OSDs was stronger than the contention with logging.

QH 

On Fri, Jul 3, 2015 at 1:23 AM, Van Leeuwen, Robert <rovanleeuwen@xxxxxxxx> wrote:
> Another issue is performance : you'll get 4x more IOPS with 4 x 2TB drives than with one single 8TB.
> So if you have a performance target your money might be better spent on smaller drives
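
The 4x figure quoted above follows from simple arithmetic if one assumes that a spinning disk delivers roughly the same random IOPS regardless of its capacity. A rough sketch, where the per-drive figure of 100 IOPS is purely an assumed placeholder and not a measurement from this thread:

```python
# Assumption: every spinner delivers about the same random IOPS,
# whatever its capacity. 100 is a placeholder, not a measured value.
ASSUMED_IOPS_PER_SPINNER = 100


def aggregate_iops(drive_count, iops_per_drive=ASSUMED_IOPS_PER_SPINNER):
    """Total IOPS across drives, assuming load spreads evenly over them."""
    return drive_count * iops_per_drive


four_small = aggregate_iops(4)  # 4 x 2TB drives, 8TB total
one_large = aggregate_iops(1)   # 1 x 8TB drive, 8TB total
ratio = four_small / one_large
print(f"4 x 2TB: {four_small} IOPS, 1 x 8TB: {one_large} IOPS, ratio {ratio:.0f}x")
```

The ratio depends only on the spindle count, so it holds for whatever per-drive figure is plugged in.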

Regardless of whether it is smart to have very large spinners:
Be aware that some of the bigger drives use SMR (shingled magnetic recording) technology.
Quoting Wikipedia on SMR:
"shingled recording writes new tracks that overlap part of the previously written magnetic track, leaving the previous track thinner and allowing for higher track density."
and
"The overlapping-tracks architecture may slow down the writing process since writing to one track overwrites adjacent tracks, and requires them to be rewritten as well."

Usually these disks are marketed "for archival use".
Generally speaking, you really should not use them unless you know exactly which write workload is hitting the disk and it consists only of very large sequential writes.

Cheers,
Robert van Leeuwen

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

