Re: Ceph Crash at sync_thread_timeout after heavy random writes.




Hi,

On 2013-3-25, at 17:30, "Wolfgang Hennerbichler" <wolfgang.hennerbichler@xxxxxxxxxxxxxxxx> wrote:

> Hi Xiaoxi,
> 
> sorry, I thought you were testing within VMs with caching turned on (I
> assumed so; you didn't tell us whether you actually ran your benchmark
> within VMs and, if not, how you tested RBD outside of VMs).
Yes, I was indeed testing within VMs.
> It just triggered an alarm in me because we had also experienced issues
> with benchmarking within a VM (it didn't crash but responded extremely
> slow).
> 
OK, but my VM didn't crash; it was the ceph-osd daemon that crashed. So is it safe to say the issue I hit is a different one (not #3737)?
         
> Wolfgang

                       xiaoxi
> 
> On 03/25/2013 10:15 AM, Chen, Xiaoxi wrote:
>> 
>> 
>> Hi Wolfgang,
>> 
>>        Thanks for the reply, but why is my problem related to issue #3737? I cannot find any direct link between them. I didn't turn on the QEMU cache, and my QEMU/VM works fine.
>> 
>> 
>>                Xiaoxi
>> 
>> On 2013-3-25, at 17:07, "Wolfgang Hennerbichler" <wolfgang.hennerbichler@xxxxxxxxxxxxxxxx> wrote:
>> 
>>> Hi,
>>> 
>>> this could be related to this issue here and has been reported multiple
>>> times:
>>> 
>>> http://tracker.ceph.com/issues/3737
>>> 
>>> In short: They're working on it, they know about it.
>>> 
>>> Wolfgang
>>> 
>>> On 03/25/2013 10:01 AM, Chen, Xiaoxi wrote:
>>>> Hi list,
>>>> 
>>>>        We have hit and reproduced this issue several times: ceph-osd
>>>> will suicide because "FileStore: sync_entry timed out" after very heavy
>>>> random IO on top of RBD.
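The 600 s figure behind this suicide should correspond to the FileStore commit timeout. A hedged ceph.conf sketch (option name and default as I recall them from FileStore-era documentation; verify against your Ceph version):

```ini
[osd]
; how long a FileStore sync/commit may run before the internal
; timeout fires and the daemon asserts (default believed to be 600 s)
filestore commit timeout = 600
```

Raising the timeout would only hide the symptom; the journal-vs-disk speed gap described later in this message would remain.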
>>>> 
>>>>        My test environment is:
>>>> 
>>>>                           4-node Ceph cluster with 20 HDDs for OSDs
>>>> and 4 Intel DC S3700 SSDs for journals per node, i.e. 80 spindles in total
>>>> 
>>>>                           48 VMs spread across 12 physical nodes, with
>>>> 48 RBDs attached to the VMs 1:1 via QEMU.
>>>> 
>>>>                           Ceph @ 0.58
>>>> 
>>>>                           XFS was used.
>>>> 
>>>>        I am using aio-stress (something like fio) to produce random
>>>> write requests on top of each RBD.
>>>> 
>>>> 
>>>> 
>>>>        From ceph -w, Ceph reports very high ops (10,000+/s), but
>>>> technically 80 spindles can provide at most 150*80/2 = 6000 IOPS for 4K
>>>> random writes.
>>>> 
>>>>        When digging into the code, I found that the OSD writes data to
>>>> the page cache and then returns. Although it calls ::sync_file_range,
>>>> that syscall does not actually sync data to disk before returning; it
>>>> is an asynchronous call. So the situation is: random writes are
>>>> extremely fast, since they only go to the journal and the page cache,
>>>> but once syncing starts it takes a very long time. Because of the speed
>>>> gap between the journal and the OSD disks, the amount of data waiting
>>>> to be synced keeps growing, and the sync will eventually exceed 600 s.
>>>> 
>>>> 
>>>> 
>>>>        For what it's worth, I have tried to reproduce this with rados
>>>> bench, but failed.
>>>> 
>>>> 
>>>> 
>>>>        Could you please let me know if you need any more information,
>>>> and whether there are any solutions? Thanks
>>>> 
>>>> 
>>>>        Xiaoxi
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>> 
>>> 
>>> 
>>> -- 
>>> DI (FH) Wolfgang Hennerbichler
>>> Software Development
>>> Unit Advanced Computing Technologies
>>> RISC Software GmbH
>>> A company of the Johannes Kepler University Linz
>>> 
>>> IT-Center
>>> Softwarepark 35
>>> 4232 Hagenberg
>>> Austria
>>> 
>>> Phone: +43 7236 3343 245
>>> Fax: +43 7236 3343 250
>>> wolfgang.hennerbichler@xxxxxxxxxxxxxxxx
>>> http://www.risc-software.at
> 
> 




