Re: Ceph Crash at sync_thread_timeout after heavy random writes.

On 03/25/2013 10:35 AM, Chen, Xiaoxi wrote:

> OK, but my VM didn't crash; it was the ceph-osd daemon that crashed. So is it safe for me to say that the issue I hit is a different one (not #3737)?

Yes, then it surely is a different issue. You had only said that ceph
crashed, with no mention of an OSD, so it was hard to tell :)

>> Wolfgang
> 
>                        xiaoxi
>>
>> On 03/25/2013 10:15 AM, Chen, Xiaoxi wrote:
>>>
>>>
>>> Hi Wolfgang,
>>>
>>>        Thanks for the reply, but why would my problem be related to issue #3737? I cannot find any direct link between them. I didn't turn on the qemu cache, and my qemu/VMs work fine.
>>>
>>>
>>>                Xiaoxi
>>>
>>> On 2013-3-25, at 17:07, "Wolfgang Hennerbichler" <wolfgang.hennerbichler@xxxxxxxxxxxxxxxx> wrote:
>>>
>>>> Hi,
>>>>
>>>> this could be related to this issue here and has been reported multiple
>>>> times:
>>>>
>>>> http://tracker.ceph.com/issues/3737
>>>>
>>>> In short: They're working on it, they know about it.
>>>>
>>>> Wolfgang
>>>>
>>>> On 03/25/2013 10:01 AM, Chen, Xiaoxi wrote:
>>>>> Hi list,
>>>>>
>>>>>        We have hit and reproduced this issue several times: ceph
>>>>> will suicide because of "FileStore: sync_entry timed out" after very
>>>>> heavy random IO on top of RBD.
>>>>>
>>>>>        My test environment is:
>>>>>
>>>>>                A 4-node Ceph cluster, each node with 20 HDDs for OSDs
>>>>> and 4 Intel DC S3700 SSDs for journals, that is 80 spindles in total
>>>>>
>>>>>                48 VMs spread across 12 physical nodes, with 48 RBDs
>>>>> attached to the VMs 1:1 via QEMU.
>>>>>
>>>>>                Ceph 0.58
>>>>>
>>>>>                XFS was used.
>>>>>
>>>>>        I am using Aiostress (something like FIO) to produce random
>>>>> write requests on top of each RBD.
>>>>>
>>>>>
>>>>>
>>>>>        In "ceph -w", Ceph reports a very high op rate (10000+ ops/s),
>>>>> but in theory 80 spindles can provide at most 150*80/2 = 6000 IOPS for
>>>>> 4K random writes.
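
A quick sanity check of the 6000 IOPS estimate quoted above, as a minimal Python sketch. The function and parameter names are made up for illustration. The divisor of 2 is read here as two disk writes per client write (for example a replication factor of 2), which the message does not state explicitly, and 150 IOPS per spindle is the poster's own rule of thumb.

```python
def sustainable_write_iops(spindles, iops_per_spindle, copies_per_write):
    """Rough ceiling for client-visible random-write IOPS.

    Every client write is assumed to cost `copies_per_write` disk writes
    (hypothetical parameter; the "/2" in the estimate is interpreted this way).
    """
    return spindles * iops_per_spindle / copies_per_write


ceiling = sustainable_write_iops(spindles=80, iops_per_spindle=150, copies_per_write=2)
reported = 10_000  # lower bound of the rate seen in "ceph -w"

print(ceiling)             # 6000.0
print(reported - ceiling)  # 4000.0 ops/s that the spindles cannot absorb
```

If the reported rate is accurate, at least 4000 ops/s are being acknowledged faster than the spindles could write them out under these assumptions.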
>>>>>
>>>>>        When digging into the code, I found that the OSD writes data to
>>>>> the page cache and then returns. It does call ::sync_file_range, but
>>>>> that syscall doesn't actually sync the data to disk before it returns;
>>>>> it's an async call. So the random writes are extremely fast, since they
>>>>> only go to the journal and the page cache, but once a sync starts it
>>>>> takes a very long time. Because of the speed gap between the journal and
>>>>> the OSD disks, the amount of data that needs to be synced keeps
>>>>> increasing, and the sync will certainly exceed 600s.
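
The backlog argument above can be put into numbers with a toy model. This is only a sketch under strong assumptions, not a description of what FileStore actually does: writes are acknowledged at a constant rate, the disks drain at a constant rate, and kernel dirty-page limits, page cache size and journal throttling are all ignored. The rates (10000 and 6000 ops/s) and the 600 s limit come from the message; the function names are hypothetical.

```python
def flush_seconds(ingest_iops, drain_iops, burst_seconds):
    """Time to flush the backlog left after a burst, in this toy model."""
    # Backlog grows by the difference between acknowledged and drained ops.
    backlog_ops = max(ingest_iops - drain_iops, 0) * burst_seconds
    return backlog_ops / drain_iops


def burst_until_timeout(ingest_iops, drain_iops, timeout_seconds):
    """Burst length after which the flush alone takes `timeout_seconds`."""
    if ingest_iops <= drain_iops:
        return float("inf")  # disks keep up, so the backlog never grows
    return timeout_seconds * drain_iops / (ingest_iops - drain_iops)


print(flush_seconds(10_000, 6_000, 900))        # 600.0
print(burst_until_timeout(10_000, 6_000, 600))  # 900.0
```

In this simplified picture, about 15 minutes of sustained random writes at the reported rate would leave enough unsynced data for a single flush to reach the 600 s limit.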
>>>>>
>>>>>
>>>>>
>>>>>        As additional information: I have tried to reproduce this with
>>>>> rados bench, but failed.
>>>>>
>>>>>
>>>>>
>>>>>        Could you please let me know if you need any more information
>>>>> or have a solution? Thanks
>>>>>
>>>>>
>>>>>        Xiaoxi
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>>>
>>>>
>>>> -- 
>>>> DI (FH) Wolfgang Hennerbichler
>>>> Software Development
>>>> Unit Advanced Computing Technologies
>>>> RISC Software GmbH
>>>> A company of the Johannes Kepler University Linz
>>>>
>>>> IT-Center
>>>> Softwarepark 35
>>>> 4232 Hagenberg
>>>> Austria
>>>>
>>>> Phone: +43 7236 3343 245
>>>> Fax: +43 7236 3343 250
>>>> wolfgang.hennerbichler@xxxxxxxxxxxxxxxx
>>>> http://www.risc-software.at


-- 
DI (FH) Wolfgang Hennerbichler
Software Development
Unit Advanced Computing Technologies
RISC Software GmbH
A company of the Johannes Kepler University Linz

IT-Center
Softwarepark 35
4232 Hagenberg
Austria

Phone: +43 7236 3343 245
Fax: +43 7236 3343 250
wolfgang.hennerbichler@xxxxxxxxxxxxxxxx
http://www.risc-software.at