Thanks, Anthony and Janne. Exactly what I have been looking for!

On Fri, Feb 25, 2022 at 9:25 AM Janne Johansson <icepic.dz@xxxxxxxxx> wrote:

> On Fri, 25 Feb 2022 at 08:49, Anthony D'Atri
> <anthony.datri@xxxxxxxxx> wrote:
> > There was a similar discussion last year around Software Heritage’s
> > archive project; I suggest digging up that thread.
> >
> > Some ideas:
> >
> > * Pack them into (optionally compressed) tarballs - from a quick
> >   search it sorta looks like HAR uses a similar model. Store the
> >   tarballs as RGW objects, or as RBD volumes, or on CephFS.
>
> After building several different kinds of storage solutions in my
> career, I can say the advice above is REALLY important. Many
> hard-to-solve problems have started out with "it is just one million
> files/objects". When you reach 50 million and sound the alarm, people
> try to throw money at the problem instead. Then you reach 200-400
> million, and you can no longer ask for the index in finite time
> without it being invalid by the time the list is complete.
>
> If you have a possibility to stick 10, 100, or 1000 small items into
> a .tar, into a .zip, into whatever, DO IT. Do it before the numbers
> grow too large to handle. Once the numbers grow too big, you seldom
> get the chance to both keep running the too-large setup AND re-pack
> the items at the same time.
>
> --
> May the most significant bit of your life be positive.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
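
For illustration only, here is a minimal sketch of the "pack small items into tarballs" advice quoted above, using Python's standard tarfile module. The function names (batched, pack_small_files), the batch-NNNNNN.tar.gz naming scheme, and the default of 1000 files per archive are assumptions made for this example and do not come from the thread. Uploading the resulting archives to RGW, RBD, or CephFS is left out.

```python
import tarfile
from pathlib import Path
from typing import Iterable, Iterator, List


def batched(paths: Iterable[Path], batch_size: int) -> Iterator[List[Path]]:
    """Yield consecutive groups of at most batch_size paths."""
    batch: List[Path] = []
    for path in paths:
        batch.append(path)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def pack_small_files(src_dir: Path, out_dir: Path, batch_size: int = 1000) -> List[Path]:
    """Pack regular files under src_dir into gzip-compressed tarballs.

    Each archive holds at most batch_size members. The archive naming
    scheme is hypothetical; pick whatever suits your index.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    # Materialize and sort the file list first, so batches are
    # deterministic and archives written below are never picked up.
    files = sorted(p for p in src_dir.rglob("*") if p.is_file())
    archives: List[Path] = []
    for index, batch in enumerate(batched(files, batch_size)):
        archive = out_dir / f"batch-{index:06d}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            for path in batch:
                # Store paths relative to src_dir so the archive is relocatable.
                tar.add(path, arcname=str(path.relative_to(src_dir)))
        archives.append(archive)
    return archives
```

Calling pack_small_files(Path("incoming"), Path("packed")) would leave one object per thousand source files to store and list, instead of one per file. A real deployment would also want a small index mapping each original name to its archive, so single items can be found without opening every tarball.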