Hello Jan,

I am testing your scripts, because we also want to test OSDs and VMs on the same server.
I am new to cgroups, so this might be a very newbie question. In your script you always
reference the file

/cgroup/cpuset/libvirt/cpuset.cpus

but I have the file in

/sys/fs/cgroup/cpuset/libvirt/cpuset.cpus

I am working on Ubuntu 14.04. Does this difference come from something special in your
setup, or from the fact that we are working on different Linux distributions?

Thanks for the clarification.
Saverio

2015-06-30 17:50 GMT+02:00 Jan Schermer <jan@xxxxxxxxxxx>:
> Hi all,
> our script is available on GitHub
>
> https://github.com/prozeta/pincpus
>
> I haven’t had much time to do a proper README, but I hope the configuration is self-explanatory enough for now.
> What it does is pin each OSD into the most “empty” cgroup assigned to a NUMA node.
>
> Let me know how it works for you!
>
> Jan
>
>
> On 30 Jun 2015, at 10:50, Huang Zhiteng <winston.d@xxxxxxxxx> wrote:
>
>
> On Tue, Jun 30, 2015 at 4:25 PM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>>
>> Not having OSDs and KVMs compete against each other is one thing.
>> But there are more reasons to do this:
>>
>> 1) not moving the processes and threads between cores so much (better cache utilization)
>> 2) aligning the processes with memory on NUMA systems (that means all modern dual-socket systems) - you don’t want your OSD running on CPU1 with memory allocated to CPU2
>> 3) the same goes for other resources like NICs or storage controllers - but that’s less important and not always practical to do
>> 4) you can limit the scheduling domain on Linux if you limit the cpuset for your OSDs (I’m not sure how important this is, just best practice)
>> 5) you can easily limit memory or CPU usage and set priorities, with much greater granularity than without cgroups
>> 6) if you have HyperThreading enabled you get the most gain when the workloads on the two threads are dissimilar - so to get the highest throughput you would pin the OSD to thread1 and KVM to thread2 on the same core. We’re not doing that, because the latency and performance of the core can vary depending on what the other thread is doing. But it might be useful to someone.
>>
>> Some workloads exhibit a >100% performance gain when everything aligns in a NUMA system, compared to SMP mode on the same hardware. You likely won’t notice it on light workloads, as the interconnects (QPI) are very fast and there’s a lot of bandwidth, but for stuff like big OLAP databases or other data-manipulation workloads there’s a huge difference. And with Ceph being CPU hungry and memory intensive, we’re seeing some big gains here just by co-locating the memory with the processes.
>
> Could you elaborate a bit on this? I'm interested to learn in what situations memory locality helps Ceph, and to what extent.
>>
>>
>>
>> Jan
>>
>>
>>
>> On 30 Jun 2015, at 08:12, Ray Sun <xiaoquqi@xxxxxxxxx> wrote:
>>
>> Sounds great, any update please let me know.
>>
>> Best Regards
>> -- Ray
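For anyone who wants to try the mechanism by hand before looking at pincpus, here is a
minimal sketch of the per-NUMA-node cpuset pinning Jan describes. It is not his script:
it assumes a cgroup v1 cpuset controller, an OSD whose PID is recorded in
/var/run/ceph/osd.0.pid (as in Ray's listing below), and a made-up cgroup name
"ceph-osd-0".

    # Find where the cpuset controller is mounted. This is the part that differs
    # between distributions: RHEL/CentOS 6-era tooling typically mounts it under
    # /cgroup, while Ubuntu 14.04 mounts it under /sys/fs/cgroup.
    grep cpuset /proc/mounts

    # Create a cgroup for osd.0 ("ceph-osd-0" is a made-up name) and give it the
    # CPUs and memory of NUMA node 0. Both cpuset.cpus and cpuset.mems must be
    # set before any task can be attached.
    CG=/sys/fs/cgroup/cpuset/ceph-osd-0
    mkdir -p "$CG"
    cat /sys/devices/system/node/node0/cpulist > "$CG/cpuset.cpus"
    echo 0 > "$CG/cpuset.mems"

    # Move the running OSD, including all of its threads, into the cgroup.
    echo $(cat /var/run/ceph/osd.0.pid) > "$CG/cgroup.procs"

So the /cgroup vs. /sys/fs/cgroup difference Saverio asks about is most likely just where
each distribution mounts the controller, not anything specific to the script.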
>>
>> On Tue, Jun 30, 2015 at 1:46 AM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>>>
>>> I promised you all our scripts for automatic cgroup assignment - they are already running in our production and I just need to put them on GitHub, stay tuned tomorrow :-)
>>>
>>> Jan
>>>
>>>
>>> On 29 Jun 2015, at 19:41, Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote:
>>>
>>> Presently, you have to do it by using a tool like ‘taskset’ or ‘numactl’…
>>>
>>> Thanks & Regards
>>> Somnath
>>>
>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Ray Sun
>>> Sent: Monday, June 29, 2015 9:19 AM
>>> To: ceph-users@xxxxxxxxxxxxxx
>>> Subject: How to use cgroup to bind ceph-osd to a specific cpu core?
>>>
>>> Cephers,
>>> I want to bind each of my ceph-osd processes to a specific CPU core, but I didn't find any document explaining that. Could anyone provide me some detailed information? Thanks.
>>>
>>> Currently, my ceph is running like this:
>>>
>>> root  28692  1  0 Jun23 ?  00:37:26 /usr/bin/ceph-mon -i seed.econe.com --pid-file /var/run/ceph/mon.seed.econe.com.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root  40063  1  1 Jun23 ?  02:13:31 /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root  42096  1  0 Jun23 ?  01:33:42 /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root  43263  1  0 Jun23 ?  01:22:59 /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root  44527  1  0 Jun23 ?  01:16:53 /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root  45863  1  0 Jun23 ?  01:25:18 /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph
>>> root  47462  1  0 Jun23 ?  01:20:36 /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/ceph/ceph.conf --cluster ceph
>>>
>>> Best Regards
>>> -- Ray
>>>
>>> ________________________________
>>>
>>> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>>>
>>
>
> --
> Regards
> Huang Zhiteng
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
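For completeness, here is a minimal sketch of the taskset/numactl route Somnath mentions,
using osd.0 from Ray's process listing above; the core list and the NUMA node number are
only examples:

    # Pin an already-running OSD, including all of its threads (-a), to cores 0-3.
    taskset -apc 0-3 $(cat /var/run/ceph/osd.0.pid)

    # Or start the OSD already bound to the CPUs and memory of NUMA node 0.
    numactl --cpunodebind=0 --membind=0 /usr/bin/ceph-osd -i 0 \
        --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf --cluster ceph

Note that taskset only sets CPU affinity; it does not constrain memory placement the way
a cpuset cgroup or numactl --membind does, which is one reason the cgroup approach
discussed earlier in the thread is attractive on NUMA machines.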