Re: [ceph-users] Ceph Crash at sync_thread_timeout after heavy random writes.

Hi Xiaoxi,

On Mon, 25 Mar 2013, Chen, Xiaoxi wrote:
>          From ceph -w, Ceph reports a very high op rate (10000+ ops/s), but
> in theory 80 spindles can provide at most 150*80/2=6000 IOPS for 4K random
> writes.
> 
>          When digging into the code, I found that the OSD writes data to
> the page cache and then returns. It does call ::sync_file_range, but this
> syscall doesn't actually sync data to disk before it returns; it's an async
> call. So the situation is that random writes are extremely fast, since they
> only go to the journal and the page cache, but once syncing starts it takes
> a very long time. Because of the speed gap between the journal and the OSD
> disks, the amount of data that needs to be synced keeps increasing, and the
> sync will certainly exceed 600s.

The sync_file_range is only there to push things to disk sooner, so that 
the eventual syncfs(2) takes less time.  When the async flushing is 
enabled, there is a limit to the number of flushes that are in the queue, 
but if it hits the max it just does

    dout(10) << "queue_flusher ep " << sync_epoch << " fd " << fd << " " << off << "~" << len
             << " qlen " << flusher_queue_len
             << " hit flusher_max_fds " << m_filestore_flusher_max_fds
             << ", skipping async flush" << dendl;

Can you confirm that the filestore is taking this path?  (debug filestore 
= 10 and then reproduce.)
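
To make the behaviour described above concrete, here is a toy Python model of a bounded flush queue. It is only a sketch of the idea (enqueue while under the limit, otherwise skip the async flush and leave the data for the later sync); it is not the actual FileStore code, and the class and method names are made up for illustration.

```python
from collections import deque


class FlusherQueue:
    """Toy model of a bounded async-flush queue (hypothetical, not Ceph code)."""

    def __init__(self, max_fds):
        self.max_fds = max_fds      # stands in for the flusher_max_fds limit
        self.pending = deque()      # flush requests waiting for the flusher
        self.skipped = 0            # requests dropped because the queue was full

    def queue_flusher(self, fd, off, length):
        """Queue an async flush; return False if the limit was hit."""
        if len(self.pending) >= self.max_fds:
            # Corresponds to the "skipping async flush" log line: the data
            # stays dirty in the page cache until the eventual full sync.
            self.skipped += 1
            return False
        self.pending.append((fd, off, length))
        return True

    def flush_one(self):
        """Take the oldest queued request, as a flusher thread would."""
        return self.pending.popleft() if self.pending else None


q = FlusherQueue(max_fds=2)
results = [q.queue_flusher(fd, 0, 4096) for fd in range(4)]
print(results, q.skipped)
```

In this model, every skipped request is work that the final sync has to do instead, which is why the debug output matters here.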

You may want to try

 filestore flusher = false
 filestore sync flush = true

and see if that changes things--it will make the sync_file_range() happen 
inline after the write.

Anyway, it sounds like you may be queueing up so many random writes that 
the sync takes forever.  I've never actually seen that happen, so if we 
can confirm that's what is going on that will be very interesting.
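
As a rough back-of-the-envelope check on that theory, the numbers from the original report can be plugged into a small Python sketch. The assumptions are mine: the /2 in 150*80/2 is taken to mean two replicas, the 60-second window is an arbitrary example, and the page cache is treated as unbounded (in practice the kernel's dirty-page limits cap the backlog).

```python
def spindle_iops_ceiling(spindles, iops_per_spindle, replicas):
    """Aggregate client-visible random-write IOPS when each write hits `replicas` disks."""
    return spindles * iops_per_spindle / replicas


def backlog_after(seconds, accepted_iops, sustained_iops):
    """Writes acknowledged but not yet on disk after `seconds` of steady load."""
    return max(0.0, (accepted_iops - sustained_iops) * seconds)


def drain_time(backlog_ops, sustained_iops):
    """Seconds needed to write out the backlog if no new writes arrive."""
    return backlog_ops / sustained_iops


ceiling = spindle_iops_ceiling(80, 150, 2)    # figures from the report
backlog = backlog_after(60, 10000, ceiling)   # 60 s window is hypothetical
drain = drain_time(backlog, ceiling)
print(ceiling, backlog, drain)                # 6000.0 240000.0 40.0
```

With these figures, each minute of load at the reported rate leaves about 40 seconds of pure write-out work behind, so the time a sync needs keeps growing for as long as the load continues.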

Thanks-
sage


> 
>          For more information, I have tried to reproduce this with rados
> bench, but failed.
> 
>          Could you please let me know if you need any more information or
> have any solutions? Thanks
> 
>                                                                 Xiaoxi
