Re: 100% IO Wait with CEPH RBD and RSYNC

I'm using XFS on the RBD disks.
They are between 1 and 10 TB in size.
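
In case it helps with narrowing this down, here is a minimal Python sketch, assuming a Linux /proc filesystem, that lists processes in uninterruptible sleep (state "D"). Such processes count towards the Linux load average and typically cannot be killed, even with SIGKILL, until the I/O they are waiting on completes, which would match the symptoms in my original mail. The helper names parse_stat and blocked_processes are made up for illustration and are not part of any Ceph tooling.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")
    pid = int(stat_text[:lpar].strip())
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    blocked = []
    if not os.path.isdir(proc_root):
        return blocked  # not a Linux system, or /proc is not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

Calling blocked_processes() periodically on a gateway and logging the result would show whether the stuck tasks are the rsync processes themselves or something else, such as kernel worker threads.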

On 20.04.2015 at 14:32, Nick Fisk wrote:
> Ah ok, good point
> 
> What FS are you using on the RBD?
> 
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
>> Christian Eichelmann
>> Sent: 20 April 2015 13:16
>> To: Nick Fisk; ceph-users@xxxxxxxxxxxxxx
>> Subject: Re:  100% IO Wait with CEPH RBD and RSYNC
>>
>> Hi Nick,
>>
>> I forgot to mention that I also tried a workaround using the userland
>> client (rbd-fuse). The behaviour was exactly the same: it worked fine for
>> several hours while I tested parallel reading and writing, then IO wait and
>> system load increased.
>>
>> This is why I don't think it is an issue with the rbd kernel module.
>>
>> Regards,
>> Christian
>>
>> On 20.04.2015 at 11:37, Nick Fisk wrote:
>>> Hi Christian,
>>>
>>> A very non-technical answer, but as the problem seems related to the
>>> RBD client, it might be worth trying the latest kernel if possible. The
>>> RBD client is kernel-based, so there may be a fix that stops this from
>>> happening.
>>>
>>> Nick
>>>
>>>> -----Original Message-----
>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
>>>> Of Christian Eichelmann
>>>> Sent: 20 April 2015 08:29
>>>> To: ceph-users@xxxxxxxxxxxxxx
>>>> Subject:  100% IO Wait with CEPH RBD and RSYNC
>>>>
>>>> Hi Ceph-Users!
>>>>
>>>> We currently have a problem, and I am not sure whether its cause lies in
>>>> Ceph or somewhere else. First, some information about our Ceph setup:
>>>>
>>>> * ceph version 0.87.1
>>>> * 5 MON
>>>> * 12 OSD with 60x2TB each
>>>> * 2 RSYNC Gateways with 2x10G Ethernet (Kernel: 3.16.3-2~bpo70+1,
>>>>   Debian Wheezy)
>>>>
>>>> Our cluster is mainly used to store log files from numerous servers via
>>>> RSync and to make them available via RSync as well. For about two weeks
>>>> we have seen very strange behaviour on our RSync gateways (they just map
>>>> several rbd devices and "export" them via rsyncd): the IO wait on the
>>>> systems increases until some of the cores get stuck at an IO wait of 100%.
>>>> RSync processes become zombies (defunct) and/or cannot be killed even
>>>> with SIGKILL. After the system has reached a load of about 1400, it
>>>> becomes totally unresponsive, and the only way to "fix" the problem is to
>>>> reboot the system.
>>>>
>>>> I tried to reproduce the problem manually by simultaneously reading and
>>>> writing from several machines, but the problem didn't appear.
>>>>
>>>> I have no idea where the error could be. I ran ceph tell osd.* bench
>>>> while the problem was occurring, and all OSDs showed normal benchmark
>>>> results. Does anyone have an idea how this can happen? If you need any
>>>> more information, please let me know.
>>>>
>>>> Regards,
>>>> Christian
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> --
>> Christian Eichelmann
>> System Administrator
>>
>> 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
>> Brauerstraße 48 · DE-76135 Karlsruhe
>> Phone: +49 721 91374-8026
>> christian.eichelmann@xxxxxxxx
>>
>> Montabaur District Court / HRB 6484
>> Management Board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
>> Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
>> Chairman of the Supervisory Board: Michael Scheeren


-- 
Christian Eichelmann
System Administrator





