Dear Loïc,

I do not have direct experience with this many files, but it resonates
for me with deduplication, such as borg (https://www.borgbackup.org/)
or a similar implementation in the latest Proxmox Backup Server
(https://pbs.proxmox.com/wiki/index.php/Main_Page). I think you would
need a filesystem for either, so I am not sure how well this would
integrate directly with RBD, but maybe cephfs is an option? I typically
run zfs on top of rbd with only zfs compression enabled, and then put
borg on top of zfs. There is overhead, but it is a very flexible setup
operationally.

All the best in your endeavor!
--
Alex Gorbachev
ISS/Storcium

On Sat, Jan 30, 2021 at 10:01 AM Loïc Dachary <loic@xxxxxxxxxxx> wrote:
> Bonjour,
>
> In the context of Software Heritage (a noble mission to preserve all
> source code)[0], artifacts have an average size of ~3KB and there are
> billions of them. They never change and are never deleted. To save
> space it would make sense to write them, one after the other, into an
> ever-growing RBD volume (more than 100TB). An index, located somewhere
> else, would record the offset and size of each artifact in the volume.
>
> I wonder if someone has already implemented this idea with success?
> And if not... does anyone see a reason why it would be a bad idea?
>
> Cheers
>
> [0] https://docs.softwareheritage.org/
>
> --
> Loïc Dachary, Artisan Logiciel Libre

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
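
[Editor's note: the layering Alex describes boils down to a handful of
commands. Below is a minimal sketch, driven from Python for concreteness;
the pool/image name "swh/backing", the zpool name "swhpool", the repository
path, and the assumption that the image maps to /dev/rbd0 are all
hypothetical placeholders, not Alex's actual configuration.]

    import subprocess

    def run(*cmd):
        """Run a command, echoing it first; abort on failure."""
        print('+', ' '.join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Expose an existing RBD image as a block device
    #    (hypothetical pool/image; assume it appears as /dev/rbd0).
    run('rbd', 'map', 'swh/backing')

    # 2. Put zfs on top, with compression as the only zfs feature in use.
    run('zpool', 'create', 'swhpool', '/dev/rbd0')
    run('zfs', 'set', 'compression=lz4', 'swhpool')

    # 3. Layer borg on top of zfs, which deduplicates at backup time.
    run('borg', 'init', '--encryption=none', '/swhpool/repo')
    run('borg', 'create', '/swhpool/repo::artifacts-1', '/srv/artifacts')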
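
[Editor's note: Loïc's append-plus-index proposal itself can be sketched
with the Python rados/rbd bindings. The pool and image names are
hypothetical, and the in-memory index and tail counter stand in for
the external index, which in practice would live in a durable store.]

    import rados
    import rbd

    POOL = 'swh-artifacts'      # hypothetical pool name
    IMAGE = 'artifact-log'      # hypothetical, pre-created growing image
    GROW_STEP = 64 * 1024**3    # grow the image 64 GiB at a time

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx(POOL)
    image = rbd.Image(ioctx, IMAGE)

    index = {}   # artifact id -> (offset, length); external in practice
    tail = 0     # next free byte; must be persisted alongside the index

    def append_artifact(artifact_id, data):
        """Write an artifact at the current tail and record where it went."""
        global tail
        if tail + len(data) > image.size():
            # RBD images are thin-provisioned, so growing ahead is cheap.
            image.resize(image.size() + GROW_STEP)
        image.write(data, tail)
        index[artifact_id] = (tail, len(data))
        tail += len(data)

    def read_artifact(artifact_id):
        offset, length = index[artifact_id]
        return image.read(offset, length)

Since artifacts never change and are never deleted, reads need no locking;
the hard part is keeping the index and the tail pointer crash-consistent
with the writes, which the sketch above deliberately leaves out.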