Serious performance problems with small file writes

Hi Hugo,

On 20 Aug 2014, at 17:54, Hugo Mills <h.r.mills at reading.ac.uk> wrote:

>> What are you using for OSD journals?
> 
>   On each machine, the three OSD journals live on the same ext4
> filesystem on an SSD, which is also the root filesystem of the
> machine.
> 
>> Also check the CPU usage for the mons and osds...
> 
>   The mons are doing pretty much nothing in terms of CPU, as far as I
> can see. I will double-check during an incident.
> 
>> Does your hardware provide enough IOPS for what your users need?
>> (e.g. what is the op/s from ceph -w)
> 
>   Not really an answer to your question, but: Before the ceph cluster
> went in, we were running the system on two 5-year-old NFS servers for
> a while. We have about half the total number of spindles that we used
> to, but more modern drives.

Was the NFS export async or sync? If async, it can't be compared to CephFS. Also, if those NFS servers had RAID cards with a write-back cache, it can't really be compared either.

> 
>   I'll look at how the op/s values change when we have the problem.
> At the moment (with what I assume to be normal desktop usage from the
> 3-4 users in the lab), they're flapping wildly somewhere around a
> median of 350-400, with peaks up to 800. Somewhere around 15-20 MB/s
> read and write.


Another tunable to look at is the filestore max sync interval. In my experience the colocated journal/OSD setup suffers with the default (5s, IIRC), especially when an OSD is getting a constant stream of writes. When this happens, the disk heads are constantly seeking back and forth between synchronously writing to the journal and flushing the outstanding writes. If we had a dedicated (spinning) disk for the journal, then the synchronous writes (to the journal) could be done sequentially (thus, quickly) and the flushes would also be quick(er). SSD journals can obviously also help with this.

For a short test I would try increasing filestore max sync interval to 30s or maybe even 60s to see if it helps. (I know that at least one of the Inktank experts advises against changing the filestore max sync interval, but in my experience 5s is much too short for the colocated journal setup.) You need to make sure your journals are large enough to store 30/60s of writes, but when you have predominantly small writes even a few GB of journal ought to be enough.
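
To put rough numbers on the journal sizing, here is a minimal sketch. It assumes the rule of thumb from the Ceph documentation (osd journal size = 2 * expected throughput * filestore max sync interval, with the size in MB) and assumes a sustained write rate of 20 MB/s per OSD, which is only a guess based on the 15-20 MB/s figure quoted above (that figure is for the whole cluster, so it is pessimistic per OSD). The function names are made up for illustration.

```python
import math


def journal_size_mb(throughput_mb_s, sync_interval_s):
    """Rule-of-thumb journal size in MB: 2 * throughput * sync interval."""
    return math.ceil(2 * throughput_mb_s * sync_interval_s)


def osd_conf_fragment(throughput_mb_s, sync_interval_s):
    """Build a ceph.conf [osd] fragment for a trial sync interval.

    throughput_mb_s is an assumed sustained write rate per OSD,
    not a measured value.
    """
    size = journal_size_mb(throughput_mb_s, sync_interval_s)
    return "\n".join([
        "[osd]",
        f"filestore max sync interval = {sync_interval_s}",
        f"osd journal size = {size}",
    ])


if __name__ == "__main__":
    # Assumed 20 MB/s of writes; compare the default with the trial values.
    for interval in (5, 30, 60):
        print(f"{interval:>2}s -> {journal_size_mb(20, interval)} MB")
    print()
    print(osd_conf_fragment(20, 30))
```

With those assumptions, a 60s interval needs about 2400 MB of journal, which fits the "few GB" estimate. Substitute the write rate you actually see per OSD before relying on the result.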

Cheers, Dan

