Re: Pacific: access via S3 / Object gateway slow for small files

Hi,

thanks for the answers. My goal was to speed up the S3 interface, and
not just a single program. This was successful with this method:
https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/#block-and-block-db
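
For reference, that method boils down to provisioning each OSD by hand
with a separate DB device via ceph-volume, roughly like this (device
names are only placeholders; see the linked doc for sizing details):

  # HDD as data device, SSD partition/LV as RocksDB+WAL device
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdc1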

However, one major disadvantage was that Cephadm flagged the OSDs as
"STRAY DAEMON" and they could no longer be administered via the
Dashboard. What really helped was this doc:

https://docs.ceph.com/en/pacific/cephadm/osd/

1. As a prerequisite, one has to turn off the automatic creation of OSDs:

  ceph orch apply osd --all-available-devices --unmanaged=true
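
A quick way to confirm that cephadm has really stopped auto-provisioning
(output details from memory, so treat this as an assumption):

  ceph orch ls osd    # the osd service(s) should now show as unmanaged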

2. Then create a YAML specification like this and apply it (apply
command shown after the spec):

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0
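
Assuming the spec is saved as osd_spec.yaml (the filename is just an
example), it can be previewed and applied with:

  ceph orch apply -i osd_spec.yaml --dry-run
  ceph orch apply -i osd_spec.yaml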

3. Delete ALL OSDs from one node:
  ceph orch osd rm <osd_id(s)>
(and be prepared to wait for many hours)
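
The draining/removal progress can be monitored with:

  ceph orch osd rm status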

4. Zap those HDDs and SSDs:
ceph orch device zap <hostname> <path>
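
For example (hostname and device path are placeholders):

  ceph orch device zap ceph-node1 /dev/sdb --force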

5. Activate the OSDs via ceph-volume:
  ceph cephadm osd activate <host>
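
Afterwards the OSDs should show up as managed cephadm daemons again;
I would check with something like:

  ceph orch ps <host>
  ceph health detail    # the STRAY DAEMON warning should be gone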

Et voilà! Now we can use the Dashboard and the SSDs are used for
WAL/DB. This speeds up access to Ceph, especially via the S3 API,
which is almost 10 times as fast as before.
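
To double-check that an OSD's DB really ended up on the SSD, its
metadata can be inspected (field names from memory, verify on your
own cluster):

  ceph osd metadata <osd_id> | grep bluefs_db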

For Pacific++ there should be a very prominent reference to the doc
"Cephadm – OSD service", in particular from the "BlueStore Settings"
(first URL above). That would have saved me many hours of testing.

Thanks anyway!

On Tue, 24 Aug 2021 at 10:41, Janne Johansson
<icepic.dz@xxxxxxxxx> wrote:
>
> On Tue, 24 Aug 2021 at 09:46, Francesco Piraneo G. <fpiraneo@xxxxxxxxxxx> wrote:
> > On 24.08.21 at 09:32, Janne Johansson wrote:
> > >> As a simple test I copied an Ubuntu /usr/share/doc (580 MB in 23'000 files):
> > >> - rsync -a to a Cephfs took 2 min
> > >> - s3cmd put --recursive took over 70 min
> > >> Users reported that the S3 access is generally slow, not only with s3tools.
> > > Single per-object accesses and writes on S3 are slower, since they
> > > involve both client- and server-side checksumming and a lot of http(s)
> > > overhead before the actual operations start, and I don't think there is
> > > a lot of connection reuse or pipelining being done, so you are going to
> > > make some 23k requests, each taking a non-zero time to complete.
> > >
> > Question: Is Swift compatible protocol faster?
>
> Probably not, but make a few tests and find out how it works at your place.
> It's kind of easy to rig both at the same time, so you can test on exactly the
> same setup.
>
> > Use case: I have to store an indefinite quantity of files for a data storage
> > service; I thought object storage was the only solution; each file is
> > identified by a UUID, no metadata on the files, files are chunked into 4 MB pieces.
>
> That sounds like a better case for S3/Swift.
>
> > In such a case, is CephFS the most suitable choice?
>
> One factor to add might be "will it be reachable from the outside?",
> since radosgw is kind of easy to put behind a set of load balancers,
> that can wash/clean incoming traffic and handle TLS offload and things
> like that. Putting cephfs out on the internet might have other cons.
>
> --
> May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



