On Mon, Dec 31, 2012 at 3:12 AM, Jens Kristian Søgaard
<jens@xxxxxxxxxxxxxxxxxxxx> wrote:
> Hi Andrey,
>
> Thanks for your reply!
>
>
>> You may try to play with SCHED_RT. I have found it hard to use
>> myself, but you can achieve your goal by adding small RT slices via
>> the "cpu" cgroup to the vcpu/emulator threads; it dramatically
>> increases overall VM responsiveness.
>
>
> I'm not quite sure I understand your suggestion.
>
> Do you mean that you set the process priority to real-time on each qemu-kvm
> process, and then use cgroups cpu.rt_runtime_us / cpu.rt_period_us to
> restrict the amount of CPU time those processes can receive?
>
> I'm not sure how that would apply here, as I have only one qemu-kvm process
> and it is not non-responsive because of the lack of allocated CPU time
> slices - but rather because some I/Os take a long time to complete, and
> other I/Os apparently have to wait for those to complete.
>

Yep, that is what I meant. Of course it won't help with only one VM; RT
may help in more concurrent cases :)

>
>> threads. Of course, some Ceph tuning like writeback cache and large
>> journal may help you too, I'm speaking primarily of VM performance by
>
>
> I have been considering the journal as something where I could improve
> performance by tweaking the setup. I have set aside 10 GB of space for the
> journal, but I'm not sure if this is too little - or if the size really
> doesn't matter that much when it is on the same mdraid as the data itself.
>
> Is there a tool that can tell me how much of my journal space is
> actually actively being used?
>
> I.e. I'm looking for something that could tell me if increasing the size of
> the journal or placing it on a separate (SSD) disk could solve my problem.

If I understood correctly, you have an md device holding both the journal
and the filestore? What type of RAID do you have here?
Of course you'll need a separate device for the journal (for experimental
purposes, a fast disk may be enough). If you have any type of redundant
storage under the filestore partition, you may also change it to simple
RAID0, or even to separate disks with one OSD created per disk. Keep an
eye on the journal device's throughput, which should at least equal the
sum of the speeds of all filestore devices; a commodity-type SSD covers
about two 100MB/s disks, for example. I have a "pure" disk setup in my
dev environment, built on quite old desktop-class machines, and a single
rsync process may still hang a VM for a short time despite a dedicated
SATA disk for the journal.

>
> How do I change the size of the writeback cache when using qemu-kvm like I
> do?
>
> Does setting rbd cache size in ceph.conf have any effect on qemu-kvm, where
> the drive is defined as:
>
> format=rbd,file=rbd:data/image1:rbd_cache=1,if=virtio
>

What cache_size/max_dirty values do you have in ceph.conf, and which qemu
version do you use? The default values are good enough to prevent I/O
spikes from being pushed down to the physical storage, but for long
I/O-intensive tasks, increasing the cache may help the OS align writes
more smoothly. Also, you don't need to set rbd_cache explicitly in the
disk config with qemu 1.2 and newer releases; for older ones,
http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg02500.html
should be applied.

>
> --
> Jens Kristian Søgaard, Mermaid Consulting ApS,
> jens@xxxxxxxxxxxxxxxxxxxx,
> http://www.mermaidconsulting.com/
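
To make the journal sizing rule above concrete, here is a rough sketch of
the arithmetic in Python. The function names and the 200MB/s SSD figure
are my own assumptions for illustration, not measured values and not part
of any Ceph tool; only the "journal throughput >= sum of filestore speeds"
rule and the two 100MB/s disks come from the text above.

```python
def journal_throughput_needed(filestore_mb_s):
    """Return the journal throughput (MB/s) needed to keep up with all
    filestore devices, i.e. the sum of their sequential write speeds."""
    return sum(filestore_mb_s)


def journal_is_sufficient(journal_mb_s, filestore_mb_s):
    """True if one journal device can absorb the combined filestore speed."""
    return journal_mb_s >= journal_throughput_needed(filestore_mb_s)


def max_disks_per_journal(journal_mb_s, disk_mb_s):
    """How many identical filestore disks one journal device can serve."""
    return int(journal_mb_s // disk_mb_s)


# Hypothetical numbers: two 100MB/s filestore disks behind one SSD that is
# assumed to sustain about 200MB/s of sequential writes.
disks = [100, 100]
assumed_ssd_mb_s = 200

print(journal_throughput_needed(disks))
print(journal_is_sufficient(assumed_ssd_mb_s, disks))
print(max_disks_per_journal(assumed_ssd_mb_s, 100))
```

Plug in the speeds you actually measure on your own disks; if the check
comes out false, the journal device is likely to be the bottleneck.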