I'm using XFS on the RBD disks. They are between 1 and 10 TB in size.

On 20.04.2015 at 14:32, Nick Fisk wrote:
> Ah ok, good point.
>
> What FS are you using on the RBD?
>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
>> Christian Eichelmann
>> Sent: 20 April 2015 13:16
>> To: Nick Fisk; ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: 100% IO Wait with CEPH RBD and RSYNC
>>
>> Hi Nick,
>>
>> I forgot to mention that I also tried a workaround using the userland
>> client (rbd-fuse). The behaviour was exactly the same: it worked fine
>> for several hours while testing parallel reading and writing, then IO
>> wait and system load increased.
>>
>> This is why I don't think it is an issue with the rbd kernel module.
>>
>> Regards,
>> Christian
>>
>> On 20.04.2015 at 11:37, Nick Fisk wrote:
>>> Hi Christian,
>>>
>>> A very non-technical answer, but as the problem seems related to the
>>> RBD client, it might be worth trying the latest kernel if possible. The
>>> RBD client is kernel-based, so there may be a fix that stops this from
>>> happening.
>>>
>>> Nick
>>>
>>>> -----Original Message-----
>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
>>>> Of Christian Eichelmann
>>>> Sent: 20 April 2015 08:29
>>>> To: ceph-users@xxxxxxxxxxxxxx
>>>> Subject: 100% IO Wait with CEPH RBD and RSYNC
>>>>
>>>> Hi Ceph-Users!
>>>>
>>>> We currently have a problem, and I am not sure whether its cause lies
>>>> in Ceph or somewhere else. First, some information about our Ceph
>>>> setup:
>>>>
>>>> * ceph version 0.87.1
>>>> * 5 MON
>>>> * 12 OSD with 60x2TB each
>>>> * 2 RSYNC Gateways with 2x10G Ethernet (Kernel: 3.16.3-2~bpo70+1,
>>>>   Debian Wheezy)
>>>>
>>>> Our cluster is mainly used to store log files from numerous servers
>>>> via RSync and make them available via RSync as well.
>>>> For about two weeks we have seen very strange behaviour on our RSync
>>>> gateways (they just map several rbd devices and "export" them via
>>>> rsyncd): the IO wait on the systems increases until some of the cores
>>>> get stuck at 100% IO wait. RSync processes become zombies (defunct)
>>>> and/or cannot be killed even with SIGKILL. After the system has
>>>> reached a load of about 1400, it becomes totally unresponsive, and the
>>>> only way to "fix" the problem is to reboot the system.
>>>>
>>>> I tried to reproduce the problem manually by simultaneously reading
>>>> and writing from several machines, but the problem didn't appear.
>>>>
>>>> I have no idea where the error could be. I ran a ceph tell osd.* bench
>>>> while the problem was occurring, and all OSDs showed normal benchmark
>>>> results. Does anyone have an idea how this can happen? If you need any
>>>> more information, please let me know.
>>>>
>>>> Regards,
>>>> Christian
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> --
>> Christian Eichelmann
>> System Administrator
>>
>> 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
>> Brauerstraße 48 · DE-76135 Karlsruhe
>> Phone: +49 721 91374-8026
>> christian.eichelmann@xxxxxxxx
>>
>> Amtsgericht Montabaur / HRB 6484
>> Management Board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich,
>> Robert Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss,
>> Jan Oetjen
>> Chairman of the Supervisory Board: Michael Scheeren