Re: Posix AIO vs libaio read performance


 



librbd doesn't know that you are using libaio vs POSIX AIO. Therefore,
the best bet is that the issue is in fio or glibc. As a first step, I
would recommend using blktrace (or similar) within your VM to
determine if there is a delta between libaio and POSIX AIO at the
block level.
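
For example, something along these lines while the fio job is running (the
device name is just an example; adjust it to the VM's disk):

  blktrace -d /dev/vda -o posixaio -w 60
  blkparse -i posixaio -d posixaio.bin > posixaio.txt
  btt -i posixaio.bin

Repeating the capture for the libaio run and comparing the btt breakdown
(queue depth, Q2C/D2C latencies) should show whether the POSIX AIO requests
even reach the block layer at the same depth as the libaio ones.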

On Fri, Mar 10, 2017 at 12:28 PM, Xavier Trilla
<xavier.trilla@xxxxxxxxxxxxxxxx> wrote:
> I disabled the rbd cache but saw no improvement, just a huge performance drop in
> writes (which proves the cache was properly disabled).
>
>
>
> Now I’m working on a few other fronts:
>
>
>
> -        Using librbd with jemalloc in the Hypervisors (Hammer .10)
>
> -        Compiling QEMU with jemalloc (QEMU 2.6) - a quick allocator check without recompiling is sketched after this list
>
> -        Running some tests from a bare-metal server using fio, but it
> will use librbd directly, so there is no way to simulate POSIX AIO (maybe I’ll try
> via krbd)
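>
> Regarding the jemalloc items, a quick sanity check before recompiling anything
> would be to preload jemalloc into the existing binaries (the library path below
> is just an example; it depends on the distro):
>
> LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 fio ...
> LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 qemu-system-x86_64 ...
>
> If the preload alone already changes the numbers, that would point at the
> allocator rather than at librbd itself.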
>
>
>
> I’m quite sure it’s something on the client side, but I don’t know enough
> about Ceph internals to completely rule out the issue being related to the OSDs.
> But so far OSD performance is really good using other test engines,
> so I’m focusing on the client side.
>
>
>
> Any help or information would be really welcome :)
>
>
>
> Thanks.
>
> Xavier.
>
>
>
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On behalf of
> Xavier Trilla
> Sent: Friday, March 10, 2017 14:13
> To: Alexandre DERUMIER <aderumier@xxxxxxxxx>
> CC: ceph-users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re: Posix AIO vs libaio read performance
>
>
>
> Hi Alexandre,
>
>
>
> Debugging is disabled in client and osds.
>
>
>
> Regarding the rbd cache, it's something I will try -I was thinking about it today-
> but I haven't tried it yet because I don't want to reduce write speed.
>
>
>
> I also tried iothreads, but no benefit.
>
>
>
> I also tried virtio-blk and virtio-scsi; there is a small
> improvement with virtio-blk, but it's only around 10%.
>
>
>
> This is becoming quite a strange issue, as it only affects POSIX AIO read
> performance. Nothing else seems to be affected -although POSIX AIO write
> is nowhere near libaio performance-.
>
>
>
> Thanks for your help; if you have any other ideas they would be really
> appreciated.
>
>
>
> Also, if somebody could run the following command from inside a VM in their
> cluster:
>
>
>
> fio --name=randread-posix --output ./test --runtime 60 --ioengine=posixaio
> --buffered=0 --direct=1 --rw=randread --bs=4k --size=1024m --iodepth=32
>
>
>
> It would be really helpful to know if I'm the only one affected or if this is
> happening in all qemu + ceph setups.
>
> Thanks!
>
> Xavier
>
>
> On March 10, 2017, at 8:07, Alexandre DERUMIER <aderumier@xxxxxxxxx>
> wrote:
>
>
>
> But it still looks like there is some bottleneck in QEMU or librbd I cannot
> manage to find.
>
>
> You can improve latency on the client by disabling debug logging.
>
> on your client, create a /etc/ceph/ceph.conf with
>
> [global]
> debug asok = 0/0
> debug auth = 0/0
> debug buffer = 0/0
> debug client = 0/0
> debug context = 0/0
> debug crush = 0/0
> debug filer = 0/0
> debug filestore = 0/0
> debug finisher = 0/0
> debug heartbeatmap = 0/0
> debug journal = 0/0
> debug journaler = 0/0
> debug lockdep = 0/0
> debug mds = 0/0
> debug mds balancer = 0/0
> debug mds locker = 0/0
> debug mds log = 0/0
> debug mds log expire = 0/0
> debug mds migrator = 0/0
> debug mon = 0/0
> debug monc = 0/0
> debug ms = 0/0
> debug objclass = 0/0
> debug objectcacher = 0/0
> debug objecter = 0/0
> debug optracker = 0/0
> debug osd = 0/0
> debug paxos = 0/0
> debug perfcounter = 0/0
> debug rados = 0/0
> debug rbd = 0/0
> debug rgw = 0/0
> debug throttle = 0/0
> debug timer = 0/0
> debug tp = 0/0
>
>
> You can also set rbd_cache=false, or in qemu set cache=none.
>
> Using an iothread on the qemu drive should help a little bit too.
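>
> A minimal sketch of both, assuming a recent enough qemu (iothreads on
> virtio-blk) and a command-line setup; the pool/image name and the ids are
> just placeholders:
>
> # in the client /etc/ceph/ceph.conf
> [client]
> rbd cache = false
>
> # on the qemu command line: no host cache plus a dedicated iothread
> -object iothread,id=iothread0 \
> -drive file=rbd:pool/image,format=raw,if=none,id=drive0,cache=none \
> -device virtio-blk-pci,drive=drive0,iothread=iothread0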
>
> ----- Original Message -----
> From: "Xavier Trilla" <xavier.trilla@xxxxxxxxxxxxxxxx>
> To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
> Sent: Friday, March 10, 2017 05:37:01
> Subject: Re: Posix AIO vs libaio read performance
>
>
>
> Hi,
>
>
>
> We compiled Hammer .10 to use jemalloc and now cluster performance has
> improved a lot, but POSIX AIO operations are still quite a bit slower than libaio.
>
>
>
> Now with a single thread read operations are about 1000 per second and write
> operations about 5000 per second.
>
>
>
> Using the same fio configuration but with libaio, read operations are about 15K per
> second and writes about 12K per second.
>
>
>
> I’m compiling QEMU with jemalloc support as well, and I’m planning to
> replace librbd on the QEMU hosts with the new build that uses jemalloc.
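>
> (A quick way to confirm which allocator a running qemu process actually ended
> up with is to look at its memory maps; the pid below is just a placeholder:
>
> grep jemalloc /proc/<qemu-pid>/maps
>
> If jemalloc is linked or preloaded, libjemalloc should show up there.)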
>
>
>
> But it still looks like there is some bottleneck in QEMU or librbd I cannot
> manage to find.
>
>
>
> Any help will be much appreciated.
>
>
>
> Thanks.
>
>
>
>
>
>
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On behalf of
> Xavier Trilla
> Sent: Thursday, March 9, 2017 6:56
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Posix AIO vs libaio read performance
>
>
>
>
> Hi,
>
>
>
> I’m trying to debug why there is a big difference between POSIX AIO and libaio
> when performing read tests from inside a VM using librbd.
>
>
>
> The results I’m getting using FIO are:
>
>
>
> POSIX AIO Read:
>
>
>
> Type: Random Read - IO Engine: POSIX AIO - Buffered: No - Direct: Yes -
> Block Size: 4KB - Disk Target: /:
>
>
>
> Average: 2.54 MB/s
>
> Average: 632 IOPS
>
>
>
> Libaio Read:
>
>
>
> Type: Random Read - IO Engine: Libaio - Buffered: No - Direct: Yes - Block
> Size: 4KB - Disk Target: /:
>
>
>
> Average: 147.88 MB/s
>
> Average: 36967 IOPS
>
>
>
> When performing writes the differences aren’t so big, because the cluster
> –which is in production right now– is CPU bound:
>
>
>
> POSIX AIO Write:
>
>
>
> Type: Random Write - IO Engine: POSIX AIO - Buffered: No - Direct: Yes -
> Block Size: 4KB - Disk Target: /:
>
>
>
> Average: 14.87 MB/s
>
> Average: 3713 IOPS
>
>
>
> Libaio Write:
>
>
>
> Type: Random Write - IO Engine: Libaio - Buffered: No - Direct: Yes - Block
> Size: 4KB - Disk Target: /:
>
>
>
> Average: 14.51 MB/s
>
> Average: 3622 IOPS
>
>
>
>
>
> Even if the write results are CPU bound, as the machines containing the
> OSDs don’t have enough CPU to handle all the IOPS (CPU upgrades are on their
> way), I cannot really understand why I’m seeing so much difference in the
> read tests.
>
>
>
> Some configuration background:
>
>
>
> - Cluster and clients are using Hammer 0.94.90
>
> - It’s a full SSD cluster running over Samsung Enterprise SATA SSDs, with
> all the typical tweaks (Customized ceph.conf, optimized sysctl, etc…)
>
> - Tried QEMU 2.0 and 2.7 – Similar results
>
> - Tried virtio-blk and virtio-scsi – Similar results
>
>
>
> I’ve been reading about POSIX AIO and libaio, and I can see there are
> several differences in how they work (like one being implemented in user space
> and the other in the kernel), but I don’t really get why Ceph has such problems
> handling POSIX AIO read operations, but not write operations, and how to
> avoid them.
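>
> One thing I plan to check is where the requests actually end up at the syscall
> level; as far as I understand, glibc services POSIX AIO with a small pool of
> helper threads doing synchronous reads, so the effective queue depth may be far
> below the requested iodepth=32. Something along these lines should show it
> (file name and sizes are just examples):
>
> strace -f -c -e trace=io_submit,pread64,clone fio --name=check-posix \
>   --ioengine=posixaio --direct=1 --rw=randread --bs=4k --size=64m \
>   --runtime=10 --filename=/tmp/fio.test
>
> Running the same with --ioengine=libaio should show io_submit calls instead of
> pread64 from helper threads.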
>
>
>
> Right now I’m trying to identify whether it’s something wrong with our Ceph
> cluster setup, with Ceph in general, or with QEMU (virtio-scsi or virtio-blk,
> as both show the same behavior).
>
>
>
> If you would like to try to reproduce the issue here are the two command
> lines I’m using:
>
>
>
> fio --name=randread-posix --output ./test --runtime 60 --ioengine=posixaio
> --buffered=0 --direct=1 --rw=randread --bs=4k --size=1024m --iodepth=32
>
> fio --name=randread-libaio --output ./test --runtime 60 --ioengine=libaio
> --buffered=0 --direct=1 --rw=randread --bs=4k --size=1024m --iodepth=32
>
>
>
>
>
> If you could shed any light on this it would be really helpful, as right
> now, although I still have some ideas left to try, I don’t have much of an idea
> about why this is happening…
>
>
>
> Thanks!
>
> Xavier
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



