Re: question about rgw delete speed

"Brent Kennedy" <bkennedy@xxxxxxxxxx> · Thu, 12 Nov 2020 18:07:34 -0500

Ceph is definitely a good choice for storing millions of files. It sounds like you plan to use this like S3, so my first question would be: are the deletes done for a specific reason (e.g. the files are used for a process and then discarded)? If it's an age thing, you can set the files to expire when putting them in, and Ceph will then clear them automatically.
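As a rough sketch of the expire-on-upload idea: since the original question mentions Swift rather than S3, one way this could look is setting the Swift `X-Delete-After` header on the PUT. The endpoint URL, container, object name and token below are placeholders, and this assumes the RGW Swift API on your release honours object expiration headers. The request is only built here, not sent.

```python
import urllib.request


def expiring_put_headers(token, ttl_seconds):
    """Return Swift headers asking the store to delete the object after ttl_seconds."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return {
        "X-Auth-Token": token,               # placeholder auth token
        "X-Delete-After": str(ttl_seconds),  # Swift object-expiration header
    }


def build_expiring_put(url, token, data, ttl_seconds):
    """Build (but do not send) a PUT request for an object that should expire."""
    return urllib.request.Request(
        url,
        data=data,
        headers=expiring_put_headers(token, ttl_seconds),
        method="PUT",
    )


# Hypothetical endpoint, container and object name; expire after 7 days.
request = build_expiring_put(
    "http://rgw.example.com/swift/v1/uploads/report.pdf",
    token="PLACEHOLDER_TOKEN",
    data=b"file contents",
    ttl_seconds=7 * 24 * 3600,
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it. With S3 the equivalent would normally be a bucket lifecycle rule rather than a per-object header.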

The more spinners you have, the more performance you will end up with. Is the network 10Gb or higher?

Octopus is production stable and contains many performance enhancements. Depending on the OS, you may not be able to upgrade from Nautilus until they work out that process (e.g. CentOS 7/8).

Delete speed is not that great, but you would have to test it with your cluster to see how it performs for your use case. If you have enough free space, is there a process that breaks if the files are not deleted?
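For the "test it with your cluster" part, a minimal timing harness might look like the sketch below. `measure_delete_rate` and the in-memory `fake_store` are hypothetical stand-ins; in a real test `delete_fn` would wrap whatever S3 or Swift client call you use. It only measures the rate the client sees, and RGW reclaims the underlying space later through its garbage collection, so freed capacity can lag behind this number.

```python
import time


def measure_delete_rate(delete_fn, keys):
    """Call delete_fn(key) for every key and return deletes per second."""
    start = time.perf_counter()
    count = 0
    for key in keys:
        delete_fn(key)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Stand-in for a real object store; replace fake_store.pop with a client delete call.
fake_store = {f"obj-{i}": b"x" for i in range(1000)}
rate = measure_delete_rate(fake_store.pop, list(fake_store))
print(f"{rate:.0f} deletes/s against the in-memory stand-in")
```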

Regards,
-Brent

Existing Clusters:
Test: Octopus 15.2.5 ( all virtual on nvme )
US Production(HDD): Nautilus 14.2.11 with 11 osd servers, 3 mons, 4 gateways, 2 iscsi gateways
UK Production(HDD): Nautilus 14.2.11 with 18 osd servers, 3 mons, 4 gateways, 2 iscsi gateways
US Production(SSD): Nautilus 14.2.11 with 6 osd servers, 3 mons, 4 gateways, 2 iscsi gateways
UK Production(SSD): Octopus 15.2.5 with 5 osd servers, 3 mons, 4 gateways

-----Original Message-----
From: Adrian Nicolae <adrian.nicolae@xxxxxxxxxx> 
Sent: Wednesday, November 11, 2020 3:42 PM
To: ceph-users <ceph-users@xxxxxxx>
Subject:  question about rgw delete speed

Hey guys,

I'm in charge of a local cloud-storage service. Our primary object storage is a vendor-based one, and I want to replace it in the near future with Ceph, using the following setup:

- 6 OSD servers, each with 36 SATA 16TB drives and 3 big NVMe drives (1 big NVMe for every 12 drives, so I can reserve 300GB of NVMe storage for every SATA drive), 3 MON, 2 RGW with Epyc 7402P and 128GB RAM. So in the end we'll have ~3PB of raw capacity and 216 SATA drives.
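The numbers in this setup can be checked with a short calculation. This is only a sketch of the arithmetic from the figures above; the constant names are made up for illustration, and usable capacity (after replication or erasure coding) is deliberately not computed because the message does not say which will be used.

```python
SERVERS = 6
SATA_PER_SERVER = 36
SATA_TB = 16             # decimal terabytes per SATA drive
SATA_PER_NVME = 12       # SATA drives sharing one NVMe device
NVME_GB_PER_SATA = 300   # NVMe space reserved per SATA drive

total_drives = SERVERS * SATA_PER_SERVER             # 216 drives
raw_tb = total_drives * SATA_TB                      # 3456 TB raw
raw_pib = raw_tb * 10**12 / 2**50                    # about 3.07 PiB
nvme_reserved_gb = SATA_PER_NVME * NVME_GB_PER_SATA  # 3600 GB used per NVMe

print(f"{total_drives} drives, {raw_tb} TB raw (~{raw_pib:.2f} PiB)")
print(f"each NVMe device needs at least {nvme_reserved_gb} GB")
```

So "~3PB" matches the raw figure when counted in binary units, and each NVMe device has to be at least 3.6TB to hold twelve 300GB reservations.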

Currently we have ~100 million files on the primary storage, with the following distribution:

- ~10% = very small files (less than 1MB: thumbnails, text and office files, and so on)

- ~60% = small files (between 1MB and 10MB)

- 20% = medium files (between 10MB and 1GB)

- 10% = big files (over 1GB)

My main concern is the speed of delete operations. We have around 500k-600k delete ops every 24 hours, so quite a lot. Our current storage is not deleting all the files fast enough (it's always 1 week to 10 days behind). I guess it is not only a software issue, and the delete speed will probably get better if we add more drives (we now have 108).
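To put the delete load in perspective, here is a small calculation of the average rate it implies. The backlog estimate assumes that "1 week to 10 days behind" means that many days' worth of deletes are pending, which is an interpretation rather than something stated above.

```python
SECONDS_PER_DAY = 24 * 3600


def per_second(deletes_per_day):
    """Average deletes per second for a given daily delete count."""
    return deletes_per_day / SECONDS_PER_DAY


low, high = 500_000, 600_000  # daily delete ops quoted above

# Assumed backlog: 7 to 10 days of pending deletes.
backlog_min = 7 * low
backlog_max = 10 * high

print(f"average rate: {per_second(low):.2f} to {per_second(high):.2f} deletes/s")
print(f"assumed backlog: {backlog_min:,} to {backlog_max:,} objects")
```

The sustained average is only about 6 to 7 deletes per second, so the real question is whether deletes arrive in bursts and how quickly a backlog of a few million objects can be drained.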

What do you think about Ceph delete speed? I read on other threads that it's not very fast. I wonder if this hardware setup can handle our current delete load better than our current storage. On the RGW servers I want to use Swift, not S3.

And another question: can I start deploying the latest Ceph version (Octopus) directly in production, or is it safer to start with Nautilus until Octopus is more stable?

Any input would be greatly appreciated!

Thanks,

Adrian.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx