Thanks, Colin. As you said, when the system's average workload is detected to be too high, the stall may come from the low-level ext3 writes: because the sync cannot return a result within 10 minutes, the cosd process commits suicide. We do not use btrfs; we use ext3 instead. Does this make any difference?

Another phenomenon: sometimes the client transfers only 4 MB of data every 5 seconds:

2011-07-04 14:35:17.512670    pg v582: 396 pgs: 396 active+clean+degraded; 1844 MB data, 6020 MB used, 124 GB / 137 GB avail; 482/964 degraded (50.000%)
2011-07-04 14:35:22.514094    pg v583: 396 pgs: 396 active+clean+degraded; 1848 MB data, 6024 MB used, 124 GB / 137 GB avail; 483/966 degraded (50.000%)
2011-07-04 14:35:27.513259    pg v584: 396 pgs: 396 active+clean+degraded; 1852 MB data, 6032 MB used, 124 GB / 137 GB avail; 484/968 degraded (50.000%)
2011-07-04 14:35:32.513605    pg v585: 396 pgs: 396 active+clean+degraded; 1856 MB data, 6036 MB used, 124 GB / 137 GB avail; 485/970 degraded (50.000%)
2011-07-04 14:35:37.513930    pg v586: 396 pgs: 396 active+clean+degraded; 1860 MB data, 6040 MB used, 124 GB / 137 GB avail; 486/972 degraded (50.000%)
2011-07-04 14:35:42.514776    pg v587: 396 pgs: 396 active+clean+degraded; 1864 MB data, 6040 MB used, 124 GB / 137 GB avail; 487/974 degraded (50.000%)
2011-07-04 14:35:47.514993    pg v588: 396 pgs: 396 active+clean+degraded; 1868 MB data, 6048 MB used, 124 GB / 137 GB avail; 488/976 degraded (50.000%)

but sometimes it reaches 100 MB/s for a few consecutive seconds. Is this related to which OSD it writes to?

Thanks!
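For what it's worth, the suicide-on-stalled-sync behavior described above boils down to a watchdog around the filestore sync: if the sync makes no progress within the timeout, the process aborts itself. A minimal Python sketch of that idea (not Ceph's actual C++ FileStore code; the function name and structure here are illustrative only, and the 600-second value matches the "sync_entry timed out after 600 seconds" message in the earlier log):

```python
import os
import signal
import threading

# 600 s matches the "sync_entry timed out after 600 seconds" log message.
SYNC_TIMEOUT_SEC = 600


def sync_with_suicide_timeout(sync_fn, timeout=SYNC_TIMEOUT_SEC):
    """Run sync_fn in a thread; if it does not finish within `timeout`
    seconds, abort the whole process (the OSD "suicide")."""
    done = threading.Event()

    def worker():
        sync_fn()
        done.set()

    threading.Thread(target=worker, daemon=True).start()
    if not done.wait(timeout):
        # In the real OSD a backtrace is dumped before the abort.
        os.kill(os.getpid(), signal.SIGABRT)
    return True
```

So a filesystem (btrfs or ext3) that genuinely stalls a sync for 10 minutes trips the watchdog regardless of what caused the stall.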
2011/7/4 Colin McCabe <cmccabe@xxxxxxxxxxxxxx>:
> On Sun, Jul 3, 2011 at 6:42 AM, huang jun <hjwsm1989@xxxxxxxxx> wrote:
>> hi, all
>> I tested ceph 0.30 on linux-2.6.37 recently. After I built the cluster:
>> bsd12:/# ceph -s
>> 2011-07-04 09:37:42.920166    pg v66: 198 pgs: 198 active+clean+degraded; 1008 MB data, 11363 MB used, 2986 MB / 15118 MB avail; 273/546 degraded (50.000%)
>> 2011-07-04 09:37:42.920674   mds e4: 1/1/1 up {0=0=up:active}
>> 2011-07-04 09:37:42.920723   osd e2: 1 osds: 1 up, 1 in
>> 2011-07-04 09:37:42.920786   log 2011-07-04 09:15:47.239098 osd0 192.168.1.102:6801/7646 73 : [INF] 1.6 scrub ok
>> 2011-07-04 09:37:42.920860   mon e1: 1 mons at {0=192.168.1.102:6789/0}
>>
>> Then I mounted the ceph fs on /mnt:
>> bsd12:/mnt/dd# dd if=/dev/zero of=sa bs=4M count=200
>> and got nothing back: it hung.
>>
>> During the write, I used sar to monitor eth0, but found there was no
>> data transferred at all:
>> 09:31:29 PM  IFACE  rxpck/s  txpck/s  rxkB/s  txkB/s  rxcmp/s  txcmp/s  rxmcst/s
>> 09:31:30 PM  lo     0.00     0.00     0.00    0.00    0.00     0.00     0.00
>> 09:31:30 PM  eth0   0.00     0.00     0.00    0.00    0.00     0.00     0.00
>> 09:31:30 PM  eth1   4.00     2.00     0.41    0.26    0.00     0.00     0.00
>>
>> It seems the OSD did not do the write, so the client could not go on writing.
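As an aside, the constant 50.000% degraded in the status above is exactly what you would expect from a 2-replica pool running on a single OSD: each object has only one of its two expected copies, so half of all replicas are missing. A rough sketch of the arithmetic (the function name is my own, not a Ceph API):

```python
def degraded_fraction(num_objects, pool_size, osds_up):
    """Fraction of expected object replicas that are missing when only
    `osds_up` of the `pool_size` desired replicas can be placed."""
    expected = num_objects * pool_size                   # e.g. 273 * 2 = 546
    missing = num_objects * max(pool_size - osds_up, 0)  # e.g. 273 * 1 = 273
    return missing / expected


# "273/546 degraded (50.000%)" from the ceph -s output above:
print(degraded_fraction(273, 2, 1))  # -> 0.5
```

That also explains why the percentage never moves as data is written: every new object is immediately degraded by the same ratio.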
>> From the osd log, the scrub loadavg is very high:
>> 2011-07-04 09:25:52.163742 7f0b56f2c700 osd0 2 tick
>> 2011-07-04 09:25:52.163788 7f0b56f2c700 osd0 2 scrub_should_schedule loadavg 2 >= max 0.5 = no, load too high
>> 2011-07-04 09:25:52.163804 7f0b56f2c700 osd0 2 do_mon_report
>> 2011-07-04 09:25:52.163819 7f0b56f2c700 osd0 2 send_alive up_thru currently 0 want 0
>> 2011-07-04 09:25:52.163833 7f0b56f2c700 osd0 2 send_pg_stats
>> 2011-07-04 09:25:52.782851 7f0b4d517700 osd0 2 update_osd_stat osd_stat(11363 MB used, 2986 MB avail, 15118 MB total, peers []/[])
>> 2011-07-04 09:25:52.782887 7f0b4d517700 osd0 2 heartbeat: stat(2011-07-04 09:25:52.782813 oprate=0.339098 qlen=0 recent_qlen=0 rdlat=0 / 0 fshedin=0)
>> 2011-07-04 09:25:52.782902 7f0b4d517700 osd0 2 heartbeat: osd_stat(11363 MB used, 2986 MB avail, 15118 MB total, peers []/[])
>> 2011-07-04 09:25:53.012934 7f0b51f22700 FileStore: sync_entry timed out after 600 seconds.
>> 2011-07-04 09:25:53.012969 1: (SafeTimer::timer_thread()+0x311) [0x6028d1]
>> 2011-07-04 09:25:53.012976 2: (SafeTimerThread::entry()+0xd) [0x604f3d]
>> 2011-07-04 09:25:53.012985 3: (()+0x68ba) [0x7f0b5bba78ba]
>> 2011-07-04 09:25:53.012992 4: (clone()+0x6d) [0x7f0b5a80302d]
>> 2011-07-04 09:25:53.012997 *** Caught signal (Aborted) **
>> in thread 0x7f0b51f22700
>>
>> I'd like to know: can too high a scrub workload cause the OSD to abort?
>> And why is scrub designed in here? To protect the consistency of PGs?
>
> That error only occurs when 10 minutes go past without any activity.
> It seems unlikely that any amount of scrubbing could cause the
> filesystem to pause for that long. Are there any backtraces from btrfs
> in the syslog? Also, you might try mounting the btrfs filesystem
> yourself to see if it works for you.
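On the scrub question: the "loadavg 2 >= max 0.5 = no, load too high" line in the log is just a threshold comparison. The OSD deliberately defers scrubbing while the machine is busy, so a high load cannot make scrub run more; it makes it run less. A minimal sketch of that decision (the 0.5 default corresponds to the OSD scrub load threshold config option, if I remember the name right; this is not the actual C++ code):

```python
import os


def scrub_should_schedule(loadavg, load_threshold=0.5):
    """Mirror of the decision in the log: refuse to schedule a scrub
    while the load average is at or above the threshold."""
    if loadavg >= load_threshold:
        return False  # "load too high"
    return True


# In the OSD this would be fed the live 1-minute load average:
# scrub_should_schedule(os.getloadavg()[0])
```

So the abort is not caused by scrub at all; scrub was being skipped, and the crash came from the filestore sync timing out.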
>
> regards,
> Colin

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html