On Fri, 25 Feb 2022 at 08:49, Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
> There was a similar discussion last year around Software Heritage’s archive project, suggest digging up that thread.
> Some ideas:
>
> * Pack them into (optionally compressed) tarballs - from a quick search it sorta looks like HAR uses a similar model. Store the tarballs as RGW objects, or as RBD volumes, or on CephFS.

Having built several different kinds of storage solutions in my career, I can say that the advice above is REALLY important. Many hard-to-solve problems have started out with "it is just one million files/objects". When you reach 50 million and sound the alarm, people try to throw money at the problem instead. Then you reach 200, 300, 400 million, and you can no longer list the index in finite time without it being out of date by the time the listing completes.

If you have a possibility to stick 10, 100, or 1000 small items into a .tar, into a .zip, into whatever, DO IT. Do it before the numbers grow too large to handle. Once the numbers have grown too big, you seldom get the chance to both keep running the too-large setup AND re-pack everything at the same time.

-- 
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
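
PS. A minimal sketch of the "stick N small items into a .tar" idea from the message above, in Python. Nothing here comes from the thread itself: the function name pack_in_batches, the batch_size parameter, and the batch-NNNNNN.tar naming are all made up for illustration. It assumes the output directory lies outside the source directory, and that the file list fits in memory, which will not hold for the very large counts discussed above (there you would stream the listing instead).

```python
import tarfile
from pathlib import Path


def pack_in_batches(src_dir, out_dir, batch_size=1000):
    """Pack regular files under src_dir into tarballs of at most batch_size members.

    Returns a dict mapping each file's relative path to the name of the
    tarball that holds it, so a single item can still be located later.
    (Hypothetical helper, for illustration only.)
    """
    src = Path(src_dir)
    out = Path(out_dir)  # assumed to be outside src_dir
    out.mkdir(parents=True, exist_ok=True)

    # Sorting makes the batch assignment deterministic between runs.
    files = sorted(p for p in src.rglob("*") if p.is_file())

    index = {}
    for start in range(0, len(files), batch_size):
        tar_path = out / f"batch-{start // batch_size:06d}.tar"
        # Mode "w" writes an uncompressed tar; "w:gz" would compress it.
        with tarfile.open(tar_path, "w") as tar:
            for path in files[start:start + batch_size]:
                rel = path.relative_to(src).as_posix()
                tar.add(path, arcname=rel)
                index[rel] = tar_path.name
    return index
```

The resulting tarballs are what you would then store as RGW objects or on CephFS, so the cluster tracks one object per batch instead of one per small file. Keeping the returned index somewhere durable is what lets you find an individual item again without listing everything.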