Re: pg 0.xxxx on [] is laggy

Hello Sage,

It actually works better after removing the old deb package ;)... thanks!

The "pg 0.xxxx on [] is laggy" messages have also now disappeared.

Here, just for information, is the dbench output I obtained with two
processes (which is not so bad given that the system is running inside a VM):
Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX       1632     8.835   213.966
 Close           1237     0.377   157.360
 Rename            73    24.290   195.969
 Unlink           291     6.514   244.166
 Qpathinfo       1455     4.544   237.949
 Qfileinfo        311     0.233     5.723
 Qfsinfo          270     1.854   110.355
 Sfileinfo        168     3.090    34.291
 Find             552    11.182   192.610
 WriteX          1076     4.916   409.798
 ReadX           2437     0.408    61.635
 LockX              4     0.011     0.017
 UnlockX            4     0.006     0.009
 Flush            140     8.630   175.423

Throughput 2.88849 MB/sec  2 clients  2 procs  max_latency=409.805 ms

I do, however, get a bunch of metadata sync warnings on the cfuse side:
"10.06.30_13:16:20.556187 7f2bc5817720 client4100 fsync - not syncing
metadata yet.. implement me"


On a different topic (sorry for the mix-up), pushing the journal to a
separate partition gives me the following crash:

--- debian-vm1# /usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap
/tmp/monmap.1924 -i 0 --mkfs --osd-data /data/osd0
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 **          testing and review.  Do not trust it with important data.       **
os/FileJournal.cc: In function 'void FileJournal::write_bl(off64_t&,
ceph::bufferlist&)':
os/FileJournal.cc:503: FAILED assert((bl.length() & ~ceph::_page_mask) == 0)
 1: (FileJournal::do_write(ceph::buffer::list&)+0x2a0) [0x55adc0]
 2: (FileJournal::write_thread_entry()+0x1fa) [0x55db4a]
 3: (FileJournal::Writer::entry()+0xd) [0x54f41d]
 4: (Thread::_entry_func(void*)+0x7) [0x46bac7]
 5: (()+0x68ba) [0x7f5dfce098ba]
 6: (clone()+0x6d) [0x7f5dfc02401d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion*'
Aborted (core dumped)

root@debian-vm1:~# gdb /usr/local/bin/cosd -c core
(gdb) bt full
#0  0x00007f5dfbf87175 in raise () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f5dfbf89f80 in abort () from /lib/libc.so.6
No symbol table info available.
#2  0x00007f5dfc81add5 in __gnu_cxx::__verbose_terminate_handler() ()
from /usr/lib/libstdc++.so.6
No symbol table info available.
#3  0x00007f5dfc819176 in ?? () from /usr/lib/libstdc++.so.6
No symbol table info available.
#4  0x00007f5dfc8191a3 in std::terminate() () from /usr/lib/libstdc++.so.6
No symbol table info available.
#5  0x00007f5dfc81929e in __cxa_throw () from /usr/lib/libstdc++.so.6
No symbol table info available.
#6  0x00000000005abb38 in ceph::__ceph_assert_fail (assertion=0x5e4ed8
"(bl.length() & ~ceph::_page_mask) == 0", file=<value optimized out>,
line=503,
    func=<value optimized out>) at common/assert.cc:30
No locals.
#7  0x0000000000557c60 in FileJournal::write_bl (this=0x24fd720,
pos=@0x7f5dfbd3cd68, bl=...) at os/FileJournal.cc:503
        __PRETTY_FUNCTION__ = "void FileJournal::write_bl(off64_t&,
ceph::bufferlist&)"
        err = <value optimized out>
#8  0x000000000055adc0 in FileJournal::do_write (this=0x24fd720,
bl=...) at os/FileJournal.cc:568
        __PRETTY_FUNCTION__ = "void FileJournal::do_write(ceph::bufferlist&)"
        lat = {tv = {tv_sec = 4224962000, tv_usec = 32605}}
        hbp = {_raw = 0x2500450, _off = 0, _len = 1024}
        pos = 1024
        split = <value optimized out>
#9  0x000000000055db4a in FileJournal::write_thread_entry
(this=0x24fd720) at os/FileJournal.cc:657
        orig_ops = 1
        bl = {_buffers = {<std::_List_base<ceph::buffer::ptr,
std::allocator<ceph::buffer::ptr> >> = {
              _M_impl =
{<std::allocator<std::_List_node<ceph::buffer::ptr> >> =
{<__gnu_cxx::new_allocator<std::_List_node<ceph::buffer::ptr> >> =
{<No data fields>}, <No data fields>}, _M_node = {_M_next = 0x2500360,
_M_prev = 0x2500360}}}, <No data fields>}, _len = 5120, append_buffer
= {_raw = 0x25002d0, _off = 0, _len = 80}, last_p = {
            bl = 0x7f5dfbd3ce10, ls = 0x7f5dfbd3ce10, off = 0, p =
{_M_node = 0x7f5dfbd3ce10}, p_off = 0}}
        orig_bytes = 206
        r = 0
        __PRETTY_FUNCTION__ = "void FileJournal::write_thread_entry()"
#10 0x000000000054f41d in FileJournal::Writer::entry (this=<value
optimized out>) at os/FileJournal.h:140
No locals.
#11 0x000000000046bac7 in Thread::_entry_func (arg=0x800) at
./common/Thread.h:39
        r = <value optimized out>
#12 0x00007f5dfce098ba in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#13 0x00007f5dfc02401d in clone () from /lib/libc.so.6
No symbol table info available.
#14 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb)
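
For what it's worth, the failed assertion appears to check that the journal
write buffer length is a multiple of the page size (frame #9 shows a
bufferlist with _len = 5120, which is not 4096-aligned). Below is a minimal
standalone C++ sketch of my reading of that check; it assumes
ceph::_page_mask is the usual ~(page_size - 1) style mask, which I have not
verified against the source:

#include <cassert>
#include <cstdint>
#include <unistd.h>

int main() {
  // Assumption: ceph::_page_mask == ~(page_size - 1).
  const uint64_t page_size = sysconf(_SC_PAGESIZE);  // typically 4096
  const uint64_t page_mask = ~(page_size - 1);

  uint64_t bl_length = 5120;  // value seen in frame #9 (_len = 5120)

  // Mirror of the failed assert: it holds only when the buffer length is
  // page-aligned.  5120 & 4095 == 1024, so this would fire, matching the
  // crash above.
  assert((bl_length & ~page_mask) == 0);
  return 0;
}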

I don't have much time to investigate right now, but I'll give it a try
later today.

Sebastien

[global]
       pid file = /var/run/ceph/$name.pid
[mon]
       debug ms = 1
       mon data = /data/mon$id
[mon0]
       host = debian-vm1
       mon addr = 127.0.0.1:6789
[mds]
       debug ms = 1
[mds0]
       host = debian-vm1
[osd]
       debug ms = 1
       sudo = true
       osd data = /data/osd$id
       osd journal = /ceph-journal/osd$id
       osd journal size = 256
       filestore journal writeahead = true
[osd0]
       host = debian-vm1


with
/dev/sda6 on /ceph-journal type ext3 (rw,user_xattr)
/dev/sda7 on /data type ext3 (rw,user_xattr)


2010/6/29 Sage Weil <sage@xxxxxxxxxxxx>:
> Hi Sebastien,
>
> On Tue, 29 Jun 2010, Sébastien Paolacci wrote:
>> First of all, many thanks for this wonderful piece of software, which
>> actually looks very promising. It's a segment that imho definitely
>> lacks credible open source alternatives to the (sometimes infamous
>> and inefficient) proprietary systems.
>
> Thanks!
>
>> So I've just pulled the unstable branch from last Sunday; here are a few
>> outcomes (local vm, sorry for that, but as for a first try...):
>>
>> - Build: transparent, which is actually not so common for the unstable
>> branch of a project said not to be mature ;). Thanks.
>>
>> - Config: it's a bit difficult to understand the real meaning of all
>> the available options (the dedicated Debian and SUSE pages are however
>> very helpful). Documentation is sparse, as expected, and I should have
>> started by reading the code anyway (so my bad in the end).
>>
>> - First setup attempt left me with a "mon fs missing 'whoami'.. did
>> you run mkcephfs?" (see end of mail). I just echoed a "0" into
>> "/data/mon0/whoami" and it did start up.
>
> It looks like when you ran /etc/init.d/ceph it tried to start
> /usr/bin/cmon, although I notice lots of /usr/local in your mkcephfs
> output.  Did you by chance build from source and then 'make install', and
> then also install a .deb or .rpm?  The "mon fs missing 'whoami'" is an old
> error message that no longer appears in the 'unstable' branch, so there is
> an old binary or old source involved somewhere.
>
>> - cfuse -m 127.0.01:5678/ /mnt/ceph is eating all my memory and
>> crashes with a bad_alloc
>
> Fixed this.. there was an ip address parsing error.  The trailing '/'
> shouldn't be there, and wasn't getting ignored.
>
>> - cfuse /mnt/ceph is however working as expected. Creating files and
>> browsing /mnt/ceph content provide the desired result; dbench -D
>> /mnt/ceph/ -t 10 2 however seems to wait endlessly for completion. On
>> the cfuse side, I'm getting (what seems to be) an endless series of "pg
>> 0.xxx on [] is laggy"
>
> That means the OSD isn't responding for some request(s).  Did cosd start?
> Does a 'ceph -s' show some osds are 'up'?  If cosd crashed, the output log
> or gdb backtrace would be helpful.
>
>> root@debian-vm1:/home/seb# cat /etc/ceph/ceph.conf | grep -v '^;'
>> [global]
>>        pid file = /var/run/ceph/$name.pid
>>        debug ms = 10
> I wouldn't put this in [global] or you will clutter up output from things
> like 'ceph -s'.
>> [mon]
>        debug ms = 1    ; is usually enough msgr output
>>        mon data = /data/mon$id
>> [mon0]
>>        host = debian-vm1
>>        mon addr = 127.0.0.1:6789
>> [mds]
>        debug ms = 1    ; is usually enough msgr output
>> [mds0]
>>        host = debian-vm1
>> [osd]
>        debug ms = 1    ; is usually enough msgr output
>>        sudo = true
>>        osd data = /data/osd$id
>>        osd journal = /data/osd$id/journal
>>        osd journal size = 128
>>        filestore journal writeahead = true
>> [osd0]
>>        host = debian-vm1
>
> sage
>

