Re: S3 key prefixes and performance impact on Ceph?

Matt Benjamin <mbenjami@xxxxxxxxxx> · Fri, 22 May 2020 09:45:53 -0400

Hi,

The current behavior is effectively that of a flat namespace.  As the
number of objects in a bucket becomes large, RGW partitions (shards) the
bucket index, and a hash of the key name determines which partition holds
each entry.  Reads on the partitions are done in parallel (unless
unordered listing, an RGW extension, is requested).
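
To illustrate why the prefix layout should not matter under hash-based
placement, here is a minimal sketch.  It is not RGW's code: SHA-256 stands
in for whatever hash RGW actually uses, and the shard count, key names,
and helper functions are hypothetical.  It only shows that hashing the
full key name spreads entries across index partitions in much the same way
for either naming scheme.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 11  # hypothetical shard count, chosen only for illustration


def shard_for_key(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a full object key to a shard index.

    SHA-256 is a stand-in hash (an assumption), not the function RGW uses.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def shard_histogram(keys, num_shards: int = NUM_SHARDS):
    """Count how many keys land on each shard."""
    counts = Counter(shard_for_key(k, num_shards) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


# Made-up data set: 100 cities x 30 dates.
cities = [f"city{i:03d}" for i in range(100)]
dates = [f"202005{d:02d}" for d in range(1, 31)]

# The two layouts from the original question.
date_first = [f"{d}/{c}/data.csv" for d in dates for c in cities]
city_first = [f"{c}/{d}/data.csv" for c in cities for d in dates]

print("yyyymmdd/city:", shard_histogram(date_first))
print("city/yyyymmdd:", shard_histogram(city_first))
```

In this sketch, keys that share a prefix (for example everything under
`20200501/`) end up scattered over the shards, because the hash looks at
the whole key name rather than at its prefix.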

Matt

On Fri, May 22, 2020 at 8:39 AM <malinsk@xxxxxxxxxxxxx> wrote:
>
> I've just set up a Ceph cluster and I'm accessing it via the object gateway with the S3 API.
>
> One thing I don't see documented anywhere is - how does Ceph performance scale with S3 key prefixes?
>
> In AWS S3, request throughput scales with the number of key prefixes (see: https://docs.aws.amazon.com/AmazonS3/latest/dev/optimizing-performance.html). I picture the keys as a nested hash table or the nodes of a prefix tree, where keys sharing a prefix are stored in closer proximity at the hardware level, so you want to spread reads evenly over prefixes to avoid concentrating parallel I/O on the same hot spots.
>
> So for example if my access pattern regularly involves scanning data through multiple dates for a single city, this key structure will be more effective: `yyyymmdd/city/data.csv`. Whereas if my access pattern involves scanning through different cities on a single date, `city/yyyymmdd/data.csv` would be more effective.
>
> How about Ceph? Does the naming convention of the key prefixes affect the performance of Ceph's object gateway, or does it treat the full object "paths" as a completely flat namespace?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>

-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309