Re: radosgw performance after 0.37

On Sun, Nov 20, 2011 at 9:49 AM, Leander Yu <leander.yu@xxxxxxxxx> wrote:
> Hi all,
> I found that after 0.37 radosgw has some fundamental changes that put
> all objects into the .rgw.buckets pool.
> From the release notes this seems to be for better scalability, but I
> wonder how this change improves scalability. Based on our tests, a
> simple list-bucket command via s3cmd takes more than 10 seconds when
> the number of objects is bigger than 10k. Is this normal, or is it a
> potential bug?

The scaling issue that was solved was the ability to increase the
number of buckets; what you're hitting now is a different issue that
relates to the number of objects per bucket. The problem is the
inefficient implementation of the rados tmap (trivial map): every
read or write of the directory index requires reading the entire
index object, which does not scale well. We are going to replace tmap
with a not-so-trivial map that will scale much better (feature #1571
in the ceph tracker, currently planned for 0.39).
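To make the difference concrete, here is a minimal, purely illustrative
Python sketch. The TrivialMap/KeyedMap classes below are hypothetical
stand-ins for the access pattern, not librados or rgw code; they only show
why a tmap-style index pays for the whole directory on every listing while
a keyed index pays only for the slice it returns:

    # Illustrative models of the access pattern, not actual Ceph code.
    import json

    class TrivialMap:
        """tmap-style index: the whole directory is one serialized blob,
        so serving any listing request means decoding the entire object."""
        def __init__(self, entries):
            self.blob = json.dumps(entries)       # one big serialized object

        def list_range(self, start, count):
            whole_index = json.loads(self.blob)   # read + decode everything
            return sorted(whole_index.items())[start:start + count]

    class KeyedMap:
        """omap-style index: entries are individually keyed and kept sorted,
        so a ranged listing returns only the slice that was asked for."""
        def __init__(self, entries):
            self.sorted_items = sorted(entries.items())

        def list_range(self, start, count):
            return self.sorted_items[start:start + count]  # keyed range read

With 10k entries, the first model decodes all 10k of them for every
1000-key chunk that s3cmd asks for; the second hands back just the chunk.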

I verified that this is in fact the issue. The problem with listing
objects using s3cmd is that it requests the data in chunks of 1000,
which means that going through 10k objects requires reading the entire
directory off disk (on the osd side) 10 times.
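As a rough back-of-the-envelope illustration (assuming, for simplicity,
that the cost is proportional to the number of index entries read),
listing 10k objects in chunks of 1000 against a tmap-backed index touches
about 10x the data that a keyed index would, and the gap grows with the
bucket size:

    num_objects = 10_000
    chunk_size = 1_000           # s3cmd requests 1000 keys at a time
    chunks = num_objects // chunk_size

    tmap_entries_read = chunks * num_objects   # whole index per chunk
    keyed_entries_read = num_objects           # each entry read once overall

    print(tmap_entries_read)    # 100000 -> quadratic in the object count
    print(keyed_entries_read)   # 10000  -> linear in the object count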

>
> I haven't fully understood the radosgw code, but it seems that when
> you list a bucket, it has to list all objects in .rgw.buckets and
> filter out the objects that don't belong to the bucket id? If my
> understanding is correct, then 10 seconds to list a bucket from s3cmd
> makes sense, since "rados -p .rgw.bucket ls" took about 7 seconds in
> my case.

I think you misread it. The old implementation did have to list the
entire pool and then filter the result, but there was a 1:1 mapping
between pools and buckets. The new implementation issues an rgw class
operation that reads the bucket's directory index.
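For clarity, here is a hypothetical sketch contrasting the list-and-filter
approach you described with what actually happens, a single lookup against
the bucket's own directory index. The dicts, function names, and the
"bucket-id prefix" naming are illustrative assumptions, not radosgw
internals:

    # What the questioner assumed the new code does (it doesn't):
    def list_bucket_by_filtering(pool, bucket_id, max_keys):
        # Scan every object in the shared pool and keep the ones whose
        # name starts with the bucket id -- cost grows with the whole pool.
        matches = [name for name in sorted(pool)
                   if name.startswith(bucket_id + "_")]
        return matches[:max_keys]

    # What it does conceptually: consult only that bucket's index object.
    def list_bucket_via_index(bucket_index, max_keys):
        # One operation against the bucket's directory index; cost is
        # independent of how many other buckets share the pool.
        return sorted(bucket_index)[:max_keys]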


Yehuda

