Re: 100% IO Wait with CEPH RBD and RSYNC

Hi Onur,

actual 333350, ideal 330128, fragmentation factor 0.97%

so fragmentation is not an issue here.
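
For reference, the figure above can be checked by hand. This is only a sketch, assuming xfs_db derives the factor as (actual - ideal) / actual, i.e. the share of extents beyond the ideal count; the function name is made up for illustration:

```python
def fragmentation_factor(actual: int, ideal: int) -> float:
    """Return the share of extents beyond the ideal count, in percent.

    Assumption: this mirrors how xfs_db's "frag" command derives its
    fragmentation factor from the actual and ideal extent counts.
    """
    if actual <= 0:
        # No extents counted, so there is nothing to be fragmented.
        return 0.0
    return (actual - ideal) * 100.0 / actual


# Extent counts reported for the rbd disk in this thread.
print(f"{fragmentation_factor(333350, 330128):.2f}%")  # prints 0.97%
```

That is only 3222 extents more than the ideal layout, or roughly 1.01 extents per ideal extent.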

Regards,
Christian

On 20.04.2015 at 16:41, Onur BEKTAS wrote:
> Hi,
> 
> Check the XFS fragmentation factor for the RBD disks, e.g.:
> 
> xfs_db -c frag -r /dev/sdX
> 
> If it is high, try defragmenting:
> 
> xfs_fsr /dev/sdX
> 
> 
> Regards,
> 
> Onur.
> 
> 
> On 4/20/2015 4:41 PM, Nick Fisk wrote:
>> If possible, it might be worth trying an EXT4-formatted RBD. I've had
>> problems with XFS hanging in the past on simple LVM volumes and never
>> really got to the bottom of it, whereas the same volumes formatted with
>> EXT4 have been running for years without a problem.
>>
>>> -----Original Message-----
>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
>>> Christian Eichelmann
>>> Sent: 20 April 2015 14:41
>>> To: Nick Fisk; ceph-users@xxxxxxxxxxxxxx
>>> Subject: Re:  100% IO Wait with CEPH RBD and RSYNC
>>>
>>> I'm using xfs on the rbd disks.
>>> They are between 1 and 10TB in size.
>>>
>>> On 20.04.2015 at 14:32, Nick Fisk wrote:
>>>> Ah ok, good point
>>>>
>>>> What FS are you using on the RBD?
>>>>
>>>>> -----Original Message-----
>>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
>>>>> Of Christian Eichelmann
>>>>> Sent: 20 April 2015 13:16
>>>>> To: Nick Fisk; ceph-users@xxxxxxxxxxxxxx
>>>>> Subject: Re:  100% IO Wait with CEPH RBD and RSYNC
>>>>>
>>>>> Hi Nick,
>>>>>
>>>>> I forgot to mention that I also tried a workaround using the
>>>>> userland client (rbd-fuse). The behaviour was exactly the same (it
>>>>> worked fine for several hours while I tested parallel reading and
>>>>> writing, then IO wait and system load increased).
>>>>>
>>>>> This is why I don't think it is an issue with the rbd kernel module.
>>>>>
>>>>> Regards,
>>>>> Christian
>>>>>
>>>>> On 20.04.2015 at 11:37, Nick Fisk wrote:
>>>>>> Hi Christian,
>>>>>>
>>>>>> A very non-technical answer but as the problem seems related to the
>>>>>> RBD client it might be worth trying the latest Kernel if possible.
>>>>>> The RBD client is Kernel based and so there may be a fix which might
>>>>>> stop this from happening.
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On
>>>>>>> Behalf Of Christian Eichelmann
>>>>>>> Sent: 20 April 2015 08:29
>>>>>>> To: ceph-users@xxxxxxxxxxxxxx
>>>>>>> Subject:  100% IO Wait with CEPH RBD and RSYNC
>>>>>>>
>>>>>>> Hi Ceph-Users!
>>>>>>>
>>>>>>> We currently have a problem, and I am not sure whether its cause
>>>>>>> lies in Ceph or somewhere else. First, some information about our
>>>>>>> Ceph setup:
>>>>>>>
>>>>>>> * ceph version 0.87.1
>>>>>>> * 5 MON
>>>>>>> * 12 OSD with 60x2TB each
>>>>>>> * 2 RSYNC Gateways with 2x10G Ethernet (Kernel: 3.16.3-2~bpo70+1,
>>>>>>>   Debian Wheezy)
>>>>>>>
>>>>>>> Our cluster is mainly used to store log files from numerous servers
>>>>>>> via RSync and to make them available via RSync as well. For about two
>>>>>>> weeks we have seen very strange behaviour on our RSync gateways (they
>>>>>>> just map several rbd devices and "export" them via rsyncd): the IO
>>>>>>> wait on the systems keeps increasing until some of the cores get
>>>>>>> stuck at an IO wait of 100%. RSync processes become zombies (defunct)
>>>>>>> and/or cannot be killed even with SIGKILL. After the system has
>>>>>>> reached a load of about 1400, it becomes totally unresponsive and the
>>>>>>> only way to "fix" the problem is to reboot the system.
>>>>>>>
>>>>>>> I tried to reproduce the problem manually by simultaneously reading
>>>>>>> and writing from several machines, but the problem didn't appear.
>>>>>>> I have no idea where the error might be. I ran ceph tell osd.* bench
>>>>>>> while the problem was occurring and all OSDs showed normal benchmark
>>>>>>> results. Does anyone have an idea how this can happen? If you need
>>>>>>> any more information, please let me know.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list
>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> -- 
>>>>> Christian Eichelmann
>>>>> System Administrator
>>>>>
>>>>> 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
>>>>> Brauerstraße 48 · DE-76135 Karlsruhe
>>>>> Phone: +49 721 91374-8026
>>>>> christian.eichelmann@xxxxxxxx
>>>>>
>>>>> Local court (Amtsgericht) Montabaur / HRB 6484
>>>>> Management board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich,
>>>>> Robert Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss,
>>>>> Jan Oetjen
>>>>> Chairman of the supervisory board: Michael Scheeren
>>>>
>>>>
>>>>
>>>
>>
>>
>>
> 






