Re: Some problem of ceph and radosgw

Hi,

Please see my reply below.

On Mon, 2010-09-27 at 15:18 +0800, cang lin wrote:
> Hi Wido,
> 
> 2010/9/27 Wido den Hollander <wido@xxxxxxxxx>:
> > Hi,
> >
> > The OSD performance you are seeing is not network related, it's due to
> > your hardware/drivers.
> >
> > Do you have a journal enabled?
> >
> The journal config in ceph.conf is as follows:
> [osd]
>         sudo = true
>         osd data = /mnt/ceph/osd$id/data
>         keyring = /etc/ceph/keyring.$name
>         osd journal = /mnt/ceph/osd$id/data/journal
>         osd journal size = 100
> 
> 
> As you can see below, the journal file is full.
> 
> root@ceph01:/mnt/ceph/osd0/data# ls -l
> total 102400
> -rw-r--r-- 1 root root        37 2010-08-26 03:27 ceph_fsid
> drwxr-xr-x 1 root root      9956 2010-09-25 22:47 current
> -rw-r--r-- 1 root root         8 2010-08-26 03:27 fsid
> -rw-r--r-- 1 root root 104857600 2010-09-26 20:10 journal
> -rw-r--r-- 1 root root        21 2010-08-26 03:27 magic
> -rw-r--r-- 1 root root         2 2010-08-26 03:27 whoami
> 
> So I think the journal is enabled. Is there anything else I need to do?
> 
> > And could you try to do a benchmark with:
> >
> > hdparm -tT /dev/sdx
> >
> > Where /dev/sdx is the disk where your data is on. This will give you the
> > read performance.
> >
> > You could also do some write tests with dd, for example:
> >
> > dd if=/dev/zero of=1GB.bin bs=1024k count=1024 conv=sync
> >
> > Run this in your OSD data directory.
> >
> 
> I followed your example; the benchmark results are as follows:
> 
> root@ceph01:/# df -h
> ...
> /dev/sde1             187G   45G  142G  24% /mnt/ceph/osd0/data
> root@ceph01:/ # hdparm -tT /dev/sde
> /dev/sde:
>  Timing cached reads:   1038 MB in  2.00 seconds = 518.85 MB/sec
>  Timing buffered disk reads:  166 MB in  3.01 seconds =  55.06 MB/sec
> 
> root@ceph01:/# cd /mnt/ceph/osd0/data
> root@ceph01:/mnt/ceph/osd0/data# dd if=/dev/zero of=1GB.bin bs=1024k count=1024 conv=sync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 87.08 s, 12.3 MB/s
> 
> root@ceph02:/# df -h
> ...
> /dev/sde1             466G   46G  420G  10% /mnt/ceph/osd1/data
> root@ceph02:/# hdparm -tT /dev/sde
> /dev/sde:
>  Timing cached reads:   1014 MB in  2.00 seconds = 506.76 MB/sec
>  Timing buffered disk reads:  198 MB in  3.02 seconds =  65.54 MB/sec
> 
> root@ceph02:/ # cd /mnt/ceph/osd1/data
> root@lz05:/mnt/ceph/osd1/data# dd if=/dev/zero of=1GB.bin bs=1024k count=1024 conv=sync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 78.3063 s, 13.7 MB/s
> 
> From the tests above, although the hardware/drivers are not very fast, they
> are still many times faster than the OSD. So how can I improve the OSD speed?

Since the journal is on the same disk as the OSD data, journal writes and
data writes compete for the same disk, so the write speed you are seeing
is slow.

I would fix that first; it explains your slow OSD write speed.
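
To confirm that the journal and the OSD data really share a filesystem, a quick check along these lines may help. This is only a sketch: the function name same_device is made up for illustration, it assumes GNU stat, and the paths are the ones from the quoted [osd] section with $id = 0.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when both paths live on the same
# filesystem, judged by the device number reported by GNU stat.
# Caveat: two partitions of one physical disk have different device
# numbers, so "different" here does not prove separate physical disks.
same_device() {
    dev_a=$(stat -c '%d' "$1") || return 2
    dev_b=$(stat -c '%d' "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}

# Paths taken from the [osd] section quoted above, with $id = 0.
data_dir=/mnt/ceph/osd0/data
journal=/mnt/ceph/osd0/data/journal

# Only run the check where those paths actually exist.
if [ -e "$data_dir" ] && [ -e "$journal" ]; then
    if same_device "$data_dir" "$journal"; then
        echo "journal shares a filesystem with the OSD data"
    else
        echo "journal is on a different filesystem"
    fi
fi
```

If the check reports a shared filesystem, pointing "osd journal" at a path on another disk is the change I am suggesting.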

> 
> > About the RADOS Gateway, which libs3 are you using? I would recommend:
> > git://github.com/wido/libs3.git
> >
> Yes, I used the libs3 that supports alternative hostnames, as recommended at
> http://ceph.newdream.net/wiki/RADOS_Gateway # Testing.
> 
> > Yesterday I found a bug in the RADOS Gateway, it might be that you are
> > also hitting it. Could you upload your error_log (Apache) somewhere or
> > attach it? And please set RGW_LOG_LEVEL to 20 to get as much information
> > as possible.

Add the following to your vhost:

SetEnv RGW_LOG_LEVEL 20

Please make sure mod_env is installed and loaded.
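
A rough way to verify both points from the shell is sketched below. The helper name has_rgw_log_level and the vhost path are assumptions (the path is the Ubuntu default site), so adjust them to your setup.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the given Apache config file contains
# an uncommented "SetEnv RGW_LOG_LEVEL 20" line.
has_rgw_log_level() {
    grep -Eq '^[[:space:]]*SetEnv[[:space:]]+RGW_LOG_LEVEL[[:space:]]+20[[:space:]]*$' "$1"
}

# Assumed vhost path (Ubuntu default site); adjust to your setup.
vhost=/etc/apache2/sites-available/default

# Only report when the file is present and readable.
if [ -r "$vhost" ]; then
    if has_rgw_log_level "$vhost"; then
        echo "RGW_LOG_LEVEL is set in $vhost"
    else
        echo "RGW_LOG_LEVEL is missing from $vhost"
    fi
fi

# On Debian/Ubuntu, mod_env can be enabled and verified with:
#   a2enmod env && /etc/init.d/apache2 restart
#   apache2ctl -M | grep env_module
```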

> >
> 
> On the ceph01:
> web1@ceph01:~$ s3 -u create alpha
> ERROR: ConnectionFailed
> 
> I found the following errors in Apache's error log:
> 10.09.27_11:44:10.678530 b784b6d0 auth: can't open key file(s)
> /etc/ceph/keyring.bin
> 10.09.27_11:44:10.681523 b784b6d0 librados: client.admin
> authentication error Operation not permitted
> Couldn't init storage provider (RADOS)
> 
> The mon, mds and osd were running on ceph01, and keyring.bin is in
> /etc/ceph/. The error occurs when keyring.bin is readable/writable only
> by root, so I made it readable/writable by anyone and restarted
> Apache. After that, error.log started growing fast:
> 
> rw-r----- 1 root adm   1478013 2010-09-27 12:24 error.log
> rw-r----- 1 root adm   1482306 2010-09-27 12:24 error.log
> rw-r----- 1 root adm   1503771 2010-09-27 12:25 error.log
> rw-r----- 1 root adm   1520943 2010-09-27 12:25 error.log
> rw-r----- 1 root adm   1529529 2010-09-27 12:25 error.log
> 
> So I had to stop Apache:
> 
> root@lz04:/var/log/apache2# /etc/init.d/apache2 stop
> rw-r----- 1 root adm   1598283 2010-09-27 12:27 error.log
> 
> After restarting Apache, the error.log reads as follows:
> ...
> [Mon Sep 27 11:45:20 2010] [notice] caught SIGTERM, shutting down
> [Mon Sep 27 11:45:21 2010] [notice] suEXEC mechanism enabled (wrapper:
> /usr/lib/apache2/suexec)
> [Mon Sep 27 11:45:21 2010] [notice] FastCGI: process manager
> initialized (pid 2753)
> [Mon Sep 27 11:45:22 2010] [notice] Apache/2.2.14 (Ubuntu)
> mod_fastcgi/2.4.6 PHP/5.3.2-1ubuntu4.5 with Suhosin-Patch
> mod_scgi/1.13 configured -- resuming normal operations
> [Mon Sep 27 12:13:28 2010] [notice] caught SIGTERM, shutting down
> [Mon Sep 27 12:13:29 2010] [notice] suEXEC mechanism enabled (wrapper:
> /usr/lib/apache2/suexec)
> [Mon Sep 27 12:13:29 2010] [notice] FastCGI: wrapper mechanism enabled
> (wrapper: /var/www/s3gw.fcgi)
> [Mon Sep 27 12:13:29 2010] [notice] FastCGI: process manager
> initialized (pid 4381)
> [Mon Sep 27 12:13:29 2010] [warn] FastCGI: server "/usr/bin/radosgw"
> (uid 33, gid 33) started (pid 4382)
> [Mon Sep 27 12:13:29 2010] [notice] Apache/2.2.14 (Ubuntu)
> mod_fastcgi/2.4.6 PHP/5.3.2-1ubuntu4.5 with Suhosin-Patch
> mod_scgi/1.13 configured -- resuming normal operations
> [Mon Sep 27 12:13:29 2010] [warn] FastCGI: (dynamic) server
> "/var/www/s3gw.fcgi" (uid 1004, gid 1004) started (pid 4389)
> RADOS S3 Gateway: FCGI_ROLE=RESPONDER
> RADOS S3 Gateway: SCRIPT_URL=/RPC2
> RADOS S3 Gateway: SCRIPT_URI=http://localhost/RPC2
> RADOS S3 Gateway: HTTP_AUTHORIZATION=
> RADOS S3 Gateway: RGW_DNS_NAME=s3.
> RADOS S3 Gateway: CONTENT_LENGTH=564
> RADOS S3 Gateway: HTTP_HOST=localhost:80
> RADOS S3 Gateway: HTTP_ACCEPT=*/*
> RADOS S3 Gateway: HTTP_USER_AGENT=Erlang XML-RPC Client 1.13
> RADOS S3 Gateway: CONTENT_TYPE=text/xml
> RADOS S3 Gateway: HTTP_CONNECTION=close
> RADOS S3 Gateway: PATH=/usr/local/bin:/usr/bin:/bin
> RADOS S3 Gateway: SERVER_SIGNATURE=
> RADOS S3 Gateway: SERVER_SOFTWARE=Apache/2.2.14 (Ubuntu)
> RADOS S3 Gateway: SERVER_NAME=localhost
> RADOS S3 Gateway: SERVER_ADDR=127.0.0.1
> RADOS S3 Gateway: SERVER_PORT=80
> RADOS S3 Gateway: REMOTE_ADDR=127.0.0.1
> RADOS S3 Gateway: DOCUMENT_ROOT=/var/www
> RADOS S3 Gateway: SERVER_ADMIN=[no address given]
> RADOS S3 Gateway: SCRIPT_FILENAME=/var/www/s3gw.fcgi
> RADOS S3 Gateway: REMOTE_PORT=53559
> RADOS S3 Gateway: GATEWAY_INTERFACE=CGI/1.1
> RADOS S3 Gateway: SERVER_PROTOCOL=HTTP/1.1
> RADOS S3 Gateway: REQUEST_METHOD=POST
> RADOS S3 Gateway: QUERY_STRING=page=RPC2&params=
> RADOS S3 Gateway: REQUEST_URI=/RPC2
> RADOS S3 Gateway: SCRIPT_NAME=/RPC2
> RADOS S3 Gateway: gateway_dns_name = s3.
> RADOS S3 Gateway: host=localhost:80
> RADOS S3 Gateway: parsed: name=page val=RPC2
> RADOS S3 Gateway: parsed: name=params val=
> RADOS S3 Gateway: in url_decode with RPC2
> RADOS S3 Gateway: s->object=<NULL> s->bucket=RPC2
> 
> RADOS S3 Gateway: FCGI_ROLE=RESPONDER  loop from here
> 
> RADOS S3 Gateway: FCGI_ROLE=RESPONDER
> ...
> 
> The error.log is attached to this mail.
> 
> The following error occurs on cl0 (the client):
> 
> web1@cl0:~$ s3 -u create alpha
> ERROR: XmlParseFailure
> 
> The "Couldn't init storage provider (RADOS)" error occurred because I
> forgot to copy keyring.bin.
> After keyring.bin was copied to /etc/ceph and Apache was restarted,
> that problem was solved.
> 
> The error.log is as follows:

I can't find any issues in your error.log file, so I'm not sure where this
is going wrong.

> 
> 10.09.26_13:31:31.781936 b729c6d0 auth: can't open key file(s)
> /etc/ceph/keyring.bin
> 10.09.26_13:31:31.971516 b729c6d0 librados: client.admin
> authentication error Operation not permitted
> Couldn't init storage provider (RADOS)
> [Sun Sep 26 13:31:31 2010] [warn] FastCGI: server "/usr/bin/radosgw"
> (pid 23589) terminated by calling exit with status '5'
> [Sun Sep 26 13:31:31 2010] [warn] FastCGI: server "/usr/bin/radosgw"
> has failed to remain running for 30 seconds given 3 attempts, its
> restart interval has been backed off to 600 seconds
> [Sun Sep 26 13:36:31 2010] [warn] FastCGI: server "/usr/bin/radosgw"
> has failed to remain running for 30 seconds given 3 attempts, its
> restart interval has been backed off to 600 seconds
> [Sun Sep 26 13:41:31 2010] [warn] FastCGI: server "/usr/bin/radosgw"
> has failed to remain running for 30 seconds given 3 attempts, its
> restart interval has been backed off to 600 seconds
> [Sun Sep 26 13:41:31 2010] [warn] FastCGI: server "/usr/bin/radosgw"
> (uid 33, gid 33) restarted (pid 24103)
> [Sun Sep 26 20:26:58 2010] [notice] Graceful restart requested, doing restart
> apache2: Could not reliably determine the server's fully qualified
> domain name, using 117.41.229.124 for ServerName
> [Sun Sep 26 20:26:58 2010] [notice] FastCGI: wrapper mechanism enabled
> (wrapper: /var/www/s3gw.fcgi)
> [Sun Sep 26 20:26:58 2010] [notice] FastCGI: process manager
> initialized (pid 27972)
> [Sun Sep 26 20:26:58 2010] [warn] FastCGI: server "/usr/bin/radosgw"
> (uid 33, gid 33) started (pid 27973)
> [Sun Sep 26 20:26:58 2010] [notice] Apache/2.2.14 (Ubuntu)
> mod_fastcgi/2.4.6 PHP/5.3.2-1ubuntu4.5 with Suhosin-Patch
> mod_scgi/1.13 configured -- resuming normal operations
> 
> Please forgive my ignorance, but I don't know which file the
> RGW_LOG_LEVEL setting of 20 should be added to. Is it ceph.conf,
> fastcgi.conf, or some other file?
> 
> 
> Thanks!
> 
>  Lin

Thanks,

Wido
