Hi guys,

I ran some tests across 4 scenarios:

(1) 5 osds in 2 hosts, ceph-0.26: one of the osds core dumps when
    creating several rbd images.
(2) 4 osds in 4 hosts, ceph-0.26: OK
(3) 5 osds in 2 hosts, ceph-0.27.1: OK
(4) 4 osds in 4 hosts, ceph-0.27.1: OK

BTW, I have 2 questions:

1. In these scenarios, after I execute "cclass -a" and "ceph class
activate rbd 1.3", I need to wait several seconds before creating an
rbd image; otherwise I get "librbd: failed to assign a block name for
image". Is this expected?

2. I ran some tests with a modified testlibrbd.c; the added code looks
like this:

gettimeofday(&tv1, NULL);
for (i = 0; i < num_test; i++)
        write_test_data(image, test_data, TEST_IO_SIZE * i, TEST_IO_SIZE);
gettimeofday(&tv2, NULL);
t1 = tv2.tv_sec - tv1.tv_sec;
temp = (float)t1 + (tv2.tv_usec - tv1.tv_usec) / 1000000.0;
speed = 1.0 * TEST_IO_SIZE * num_test / temp / 1024 / 1024;
printf("time used: temp=%.3f\n", temp);
printf("write speed: %.2f MB/s\n", speed);

The results I got are very slow:

time used: temp=46.611
write speed: 0.21 MB/s
time used: temp=14.706
read speed: 0.68 MB/s
time used: temp=45.453
aio write speed: 0.22 MB/s
time used: temp=14.759
aio read speed: 0.68 MB/s

During the test, though, some cosd processes were running with high
CPU usage.
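For reference, here is a self-contained sketch of that timing harness;
it calls the librbd C API's rbd_write() directly (write_test_data in
testlibrbd.c is essentially a thin wrapper around it), and TEST_IO_SIZE,
the buffer contents, and the helper names are placeholders of mine, not
the exact testlibrbd.c code:

/*
 * Sketch of the timing loop above. "image" must already have been
 * opened with rbd_open(); num_test is the number of sequential writes.
 */
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <sys/time.h>
#include <rbd/librbd.h>

#define TEST_IO_SIZE 4096

/* seconds elapsed between two gettimeofday() samples */
static double elapsed(const struct timeval *a, const struct timeval *b)
{
        return (b->tv_sec - a->tv_sec) +
               (b->tv_usec - a->tv_usec) / 1000000.0;
}

void time_writes(rbd_image_t image, int num_test)
{
        char test_data[TEST_IO_SIZE];
        struct timeval tv1, tv2;
        double temp, speed;
        int i;

        memset(test_data, 'a', sizeof(test_data)); /* dummy payload */

        gettimeofday(&tv1, NULL);
        for (i = 0; i < num_test; i++)
                rbd_write(image, (uint64_t)TEST_IO_SIZE * i,
                          TEST_IO_SIZE, test_data);
        gettimeofday(&tv2, NULL);

        temp = elapsed(&tv1, &tv2);
        speed = 1.0 * TEST_IO_SIZE * num_test / temp / 1024 / 1024;
        printf("time used: temp=%.3f\n", temp);
        printf("write speed: %.2f MB/s\n", speed);
}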
Thx!
Simon

2011/5/10 Yehuda Sadeh Weinraub <yehudasa@xxxxxxxxx>:
> On Tue, May 10, 2011 at 7:15 AM, Simon Tian <aixt2006@xxxxxxxxx> wrote:
>> Hi,
>>
>> As you said, one of the osds crashed:
>> ================= log ========================
>> 2011-05-10 21:46:38.990311 4bc90940 osd2 8 pg[3.13a( v 8'1 (0'0,8'1]
>> n=1 ec=2 les=6 5/5/4) [2,3] r=0 mlcod 0'0 active+clean]
>> oi.user_version=8'2 is_modify=0
>> 2011-05-10 21:46:38.990386 4bc90940 osd2 8 pg[3.13a( v 8'1 (0'0,8'1]
>> n=1 ec=2 les=6 5/5/4) [2,3] r=0 mlcod 0'0 active+clean]
>> oi.user_version=8'2 is_modify=1
>> *** Caught signal (Segmentation fault) **
>>  in thread 0x45382940
>> =========================================
>>
>> I tried again. This time "rbd create foo --size 1024" succeeded, but
>> when I ran the code of testlibrbd.c, one of the osds crashed again:
>> ================= log ========================
>> 2011-05-10 22:08:20.008871 4c115940 osd3 10 pg[4.1( v 9'4 (9'2,9'4]
>> n=1 ec=9 les=9 9/9/9) [3,0] r=0 mlcod 9'3 active+clean
>> snaptrimq=[1~1]] dump_watchers testimg.rbd/head testimg.rbd/head(9'4
>> client4107.0:14 wrlock_by=unknown0.0:0)
>> 2011-05-10 22:08:20.008903 4c115940 osd3 10 pg[4.1( v 9'4 (9'2,9'4]
>> n=1 ec=9 les=9 9/9/9) [3,0] r=0 mlcod 9'3 active+clean
>> snaptrimq=[1~1]]  * obc->watcher: client4107 session=0xc80990
>> 2011-05-10 22:08:20.008925 4c115940 osd3 10 pg[4.1( v 9'4 (9'2,9'4]
>> n=1 ec=9 les=9 9/9/9) [3,0] r=0 mlcod 9'3 active+clean
>> snaptrimq=[1~1]]  * oi->watcher: client4107 cookie=2
>> 2011-05-10 22:08:20.009232 4b914940 osd3 10 pg[4.1( v 9'4 (9'2,9'4]
>> n=1 ec=9 les=9 9/9/9) [3,0] r=0 mlcod 9'3 active+clean]
>> oi.user_version=10'5 is_modify=1
>> 2011-05-10 22:08:20.009267 4b914940 expires 2011-05-10 23:08:19.890032
>> now 2011-05-10 22:08:20.009260
>> 2011-05-10 22:08:20.009284 napshots_list
>> 2011-05-10 22:08:20.009307 4b914940 osd3 10 pg[4.1( v 9'4 (9'2,9'4]
>> n=1 ec=9 les=9 9/9/9) [3,0] r=0 mlcod 9'3 active+clean]
>> oi.user_version=10'5 is_modify=0
>> 2011-05-10 22:08:20.009375 4b914940 osd3 10 pg[4.1( v 9'4 (9'2,9'4]
>> n=1 ec=9 les=9 9/9/9) [3,0] r=0 mlcod 9'3 active+clean]
>> oi.user_version=10'5 is_modify=1
>> *** Caught signal (Segmentation fault) **
>>  in thread 0x4eb1c940
>> =========================================
>>
> Can you by any chance get a backtrace for that crash (gdb cosd core;
> bt)? You might need to have the debug packages installed.
> Also, note that you're not running the latest version, so you might be
> hitting something that was already fixed (not that I remember anything
> specific, but it might be worth a try).
>
> Thanks,
> Yehuda
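
Expanding your gdb command, this is roughly the session I will run once
I reproduce the crash (the /usr/bin/cosd path and the core file landing
in cosd's working directory are assumptions for my setup, and I'll have
the debug packages installed first):

gdb /usr/bin/cosd core        # load the cosd binary together with the core file
(gdb) bt                      # backtrace of the thread that caught the signal
(gdb) thread apply all bt     # backtraces for every thread, in case it helps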