On Sun, Jul 3, 2011 at 6:42 AM, huang jun <hjwsm1989@xxxxxxxxx> wrote:
> hi all,
> I have been testing ceph 0.30 on linux-2.6.37 recently. After building
> the cluster:
>
> bsd12:/# ceph -s
> 2011-07-04 09:37:42.920166    pg v66: 198 pgs: 198 active+clean+degraded; 1008 MB data, 11363 MB used, 2986 MB / 15118 MB avail; 273/546 degraded (50.000%)
> 2011-07-04 09:37:42.920674   mds e4: 1/1/1 up {0=0=up:active}
> 2011-07-04 09:37:42.920723   osd e2: 1 osds: 1 up, 1 in
> 2011-07-04 09:37:42.920786   log 2011-07-04 09:15:47.239098 osd0 192.168.1.102:6801/7646 73 : [INF] 1.6 scrub ok
> 2011-07-04 09:37:42.920860   mon e1: 1 mons at {0=192.168.1.102:6789/0}
>
> Then I mounted the ceph fs on /mnt and wrote to it:
>
> bsd12:/mnt/dd# dd if=/dev/zero of=sa bs=4M count=200
>
> The dd produced nothing; it just hung.
>
> During the write I used sar to monitor eth0, but found no data being
> transferred at all:
>
> 09:31:29 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
> 09:31:30 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> 09:31:30 PM      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> 09:31:30 PM      eth1      4.00      2.00      0.41      0.26      0.00      0.00      0.00
>
> It seems the OSD never performed the write, which is why the client
> could not make progress. In the osd log, the load average seen by the
> scrub scheduler is very high:
>
> 2011-07-04 09:25:52.163742 7f0b56f2c700 osd0 2 tick
> 2011-07-04 09:25:52.163788 7f0b56f2c700 osd0 2 scrub_should_schedule loadavg 2 >= max 0.5 = no, load too high
> 2011-07-04 09:25:52.163804 7f0b56f2c700 osd0 2 do_mon_report
> 2011-07-04 09:25:52.163819 7f0b56f2c700 osd0 2 send_alive up_thru currently 0 want 0
> 2011-07-04 09:25:52.163833 7f0b56f2c700 osd0 2 send_pg_stats
> 2011-07-04 09:25:52.782851 7f0b4d517700 osd0 2 update_osd_stat osd_stat(11363 MB used, 2986 MB avail, 15118 MB total, peers []/[])
> 2011-07-04 09:25:52.782887 7f0b4d517700 osd0 2 heartbeat: stat(2011-07-04 09:25:52.782813 oprate=0.339098 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
> 2011-07-04 09:25:52.782902 7f0b4d517700 osd0 2 heartbeat: osd_stat(11363 MB used, 2986 MB avail, 15118 MB total, peers []/[])
> 2011-07-04 09:25:53.012934 7f0b51f22700 FileStore: sync_entry timed out after 600 seconds.
> 2011-07-04 09:25:53.012969  1: (SafeTimer::timer_thread()+0x311) [0x6028d1]
> 2011-07-04 09:25:53.012976  2: (SafeTimerThread::entry()+0xd) [0x604f3d]
> 2011-07-04 09:25:53.012985  3: (()+0x68ba) [0x7f0b5bba78ba]
> 2011-07-04 09:25:53.012992  4: (clone()+0x6d) [0x7f0b5a80302d]
> 2011-07-04 09:25:53.012997 *** Caught signal (Aborted) **
>  in thread 0x7f0b51f22700
>
> I would like to know: can too heavy a scrub workload make the OSD
> abort? And why is scrub designed this way? Is it to protect the
> consistency of the PGs?

That error only occurs when 10 minutes go by without any filestore
activity completing. It seems unlikely that any amount of scrubbing
could make the filesystem pause for that long.

Are there any backtraces from btrfs in the syslog? You might also try
mounting the btrfs filesystem yourself to see whether it behaves
correctly on its own.

regards,
Colin
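
P.S. A few concrete things you could try. To look for btrfs trouble and
to exercise the disk with ceph out of the picture, something along
these lines should work (/dev/sdb1 and /mnt/test are placeholders for
wherever your osd data actually lives):

  # any btrfs warnings or backtraces in the kernel log?
  dmesg | grep -i btrfs
  grep -i btrfs /var/log/messages

  # mount the osd's btrfs volume directly and write to it,
  # bypassing ceph entirely
  mount -t btrfs /dev/sdb1 /mnt/test
  dd if=/dev/zero of=/mnt/test/bigfile bs=4M count=200 conv=fsync

If that dd also stalls for minutes at a time, the problem is in btrfs
(or the disk underneath it), not in the OSD.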
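
On the scrub question: the "scrub_should_schedule loadavg 2 >= max 0.5
= no, load too high" line means the opposite of what you suspected. The
OSD is *skipping* scrubs because the load average is already above its
threshold; scrub is not what is generating the load. Scrub exists to
verify that the copies of a PG stay consistent, and it deliberately
backs off when the machine is busy. If you want scrubbing to go ahead
anyway, the threshold is tunable in ceph.conf (option name as of 0.30,
if I remember it right; the default is 0.5, matching your log):

  [osd]
      ; allow scrubbing up to this load average (default 0.5)
      osd scrub load threshold = 5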
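
The "600 seconds" in the sync_entry message should be the filestore
commit timeout (default 600 s around this version): if a filestore
sync has not completed after that long, the OSD aborts on purpose
rather than stay wedged. Raising it only papers over the underlying
stall, but it can buy time while you debug:

  [osd]
      ; seconds a filestore commit may take before the OSD aborts
      filestore commit timeout = 1200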
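
One last side note: the "273/546 degraded (50.000%)" in your ceph -s
output is expected on a single-OSD cluster, because the default
replication level is 2 and the second copy has nowhere to go. It is
unrelated to the hang. If you want a clean active+clean state with one
OSD, you can drop the replica count to 1 (assuming I have the pool-set
syntax of this era right):

  ceph osd pool set data size 1
  ceph osd pool set metadata size 1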