Re: Ceph, make (22nd-sept unstable) fails, and slow write issues.

Hi Sage:

> The dbench workload is metadata intensive, and most metadata (update)
> operations require a round trip to the MDS.  The client/mds protocol is
> pretty aggressive about issuing leases to the client, but that only goes
> so far.  Metadata performance will never be as fast as a local fs for a
> single process.

I see. What about the metadata workload for typical block-based
benchmarks, e.g., iometer or iozone doing random/sequential reads and
writes -- are those also metadata intensive?
When I test the whole system's performance, I'd need to mount ceph and
create a file the size of the two 1TB disks spread over the two OSDs --
2TB! Even creating this blank file takes several hours, so I think this
will be very hard to scale to larger benchmarks, e.g., 10TB.
Is there a better/simpler approach to testing the performance? Possibly
just accessing /dev/sdX directly over the network?
I could simply create a smaller file (say 1GB) for testing, but
depending on where on the disk that chunk ends up (especially on an
HDD), it could change the result.
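(For example, would it be a fair shortcut to pre-create the 2TB file as
a sparse file, along the lines below, or would that skew the results
because no objects actually get written until the benchmark touches
them? The path and size here are just an illustration.)

  # create a 2TB sparse file on the mounted ceph fs (2097152 x 1MB, no data blocks written yet)
  dd if=/dev/zero of=/ceph/testfile bs=1M count=0 seek=2097152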

Also, when mounting, how do I check/enable -o big_writes and
-o direct_io?
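(Is it something along these lines for the FUSE client? This is just a
guess on my part -- I don't know whether cfuse passes -o options through
to FUSE, or whether the kernel client supports them at all; the monitor
address is just a placeholder.)

  # guess: mount through the FUSE client and hand the options to FUSE
  cfuse -m 192.168.0.10:6789 -o big_writes,direct_io /ceph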


> Hmmm -- it looks like the scripts are split on whether you have
> tcmalloc installed in your system, and I don't see any obvious issues
> when I go over them. Do you have tcmalloc installed? Did you try a
> "make clean; make" cycle?

Sadly, I still get the error on some of the other PCs:
cosd.cc: In function ‘int main(int, const char**)’:
cosd.cc:65: error: ‘IsHeapProfilerRunning’ was not declared in this scope
cosd.cc:310: warning: ignoring return value of ‘int chdir(const
char*)’, declared with attribute warn_unused_result
make[2]: *** [cosd-cosd.o] Error 1

I've tried 'make clean' and re-cloning everything from git; only
reverting to the 'stable' branch compiles, 'unstable' just doesn't
build. Any way to fix this without re-installing the OS? ;-)
Another silly question: since I'm building a .deb with dpkg and
installing that, is there a faster way to compile and test? It seems
that every time something changes I need to rebuild the package (which
takes several minutes) and re-install it.
I've tried plain configure/make, but that doesn't include the nifty deb
extras like the automatic /etc/init.d/ceph daemon script.
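(What I had in mind was something like the following: incremental
rebuilds in the source tree, copying the changed binaries over the
packaged ones, and installing the init script once by hand. The exact
paths are a guess on my part -- is that a sane workflow?)

  # incremental rebuild in the source tree
  ./autogen.sh && ./configure && make
  # copy only the rebuilt daemons over the ones installed from the .deb (paths guessed)
  sudo cp src/cmon src/cmds src/cosd /usr/bin/
  # install the init script once by hand (it seems to be generated in src/ during the build)
  sudo install -m 755 src/init-ceph /etc/init.d/ceph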

Thanks a lot.


On Fri, Sep 24, 2010 at 5:20 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Fri, 24 Sep 2010, DongJin Lee wrote:
>> > Wow, these levels seem oddly slow. What filesystem are those disks running?
>> > What tool are you using to test IOPS, and what's your network setup?
>> > If you can move the journal onto a separate device it will help.
>> > -Greg
>> >
>>
>> I have changed the OSD disk to an SSD (OCZ Vertex2).
>> To simplify the problem, only one OSD is used,
>> so now, 3 PCs - 1 OSD, 1 MDS, 1 MON (all similar spec, except the OSD's
>> 1TB disk replaced with a 60GB SSD), using btrfs,
>> all connected to a 1Gb/s managed switch. Iperf gives about 980Mb/s.
>>
>> I ran dbench, on the OSD PC.
>> Local vs /ceph
>>
>> using dbench -t 10 100 -D /data/osd0/
>> -  gives 150MB/s, good and as expected.
>>
>> using dbench -t 10 100 -D /ceph
>> -  gives 15MB/s
>
> The dbench workload is metadata intensive, and most metadata (update)
> operations require a round trip to the MDS.  The client/mds protocol is
> pretty aggressive about issuing leases to the client, but that only goes
> so far.  Metadata performance will never be as fast as a local fs for a
> single process.
>
>> Also, I often get errors from dbench while benchmarking (only when
>> /ceph is mounted, not elsewhere):
>>
>> [323] unlink '/media/cephlocal/clients/client25/~dmtmp/WORD/CHAP10.DOC'
>> failed - No such file or directory
>>   52  cleanup  14 sec
>>   39  cleanup  15 sec
>> [323] unlink '/media/cephlocal/clients/client24/~dmtmp/WORDPRO/NEWS1_1.LWP'
>> failed - No such file or directory
>> [323] unlink '/media/cephlocal/clients/client91/~dmtmp/WORD/CHAP10.DOC'
>> failed - No such file or directory
>> [323] unlink '/media/cephlocal/clients/client27/~dmtmp/WORDPRO/RESULTS.XLS'
>> failed - No such file or directory
>>
>> and
>>
>> [31] open /media/cephlocal/clients/client258/filler.001 failed for
>> handle 9939 (No such file or directory)
>> [31] open /media/cephlocal/clients/client220/filler.001 failed for
>> handle 9939 (No such file or directory)
>> [111] open /media/cephlocal/clients/client218/~dmtmp/WORD failed for
>> handle 9943 (No such file or directory)
>> [91] open /media/cephlocal/clients/client986/filler.004 failed for
>> handle 9942 (No such file or directory)
>
> What version of the client are you running?  I was hitting this (or a
> similar) problem with the dbench workload a couple weeks ago and it turned
> out to be a long standing bug in fs/namei.c that has finally (finally!)
> been fixed in 2.6.36-rc2 (2e2e88ea).  I hadn't seen it actually come up
> for over a year, so something must have changed with the kclient or MDS
> recently that made the bug happen, but in any case it's fixed now.
>
> sage
>

