Re: Using RBD to pack billions of small files

Hi Alex,

Using borg would indeed make sense to replicate the RBD content in case
rbd-mirror is not an option, nice idea :-)

Interestingly, there is no need for a proper file system: the files are immutable and
never deleted. They are indexed by the SHA256 of their content, so a map where the key is
the SHA256 and the value is the (offset, size) pair in the RBD image would be enough.
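
To make the idea concrete, here is a minimal sketch in Python. It assumes the
image is reachable as a seekable binary file object (for instance a mapped RBD
block device, or an in-memory buffer for testing). The PackStore class, its
method names and the in-memory dict used as the index are all hypothetical
placeholders; a real index would have to live in a persistent store, together
with the current end-of-data offset.

```python
import hashlib
import io


class PackStore:
    """Append-only pack: artifacts are written back to back and located
    through a map SHA256 -> (offset, size). Hypothetical illustration."""

    def __init__(self, volume, end=0):
        self.volume = volume   # seekable binary file-like object (assumption)
        self.index = {}        # hex SHA256 -> (offset, size); in-memory stand-in
        self.end = end         # offset where the next artifact will be written

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key in self.index:
            # Content-addressed and immutable: identical content is stored once.
            return key
        self.volume.seek(self.end)
        self.volume.write(data)
        self.index[key] = (self.end, len(data))
        self.end += len(data)
        return key

    def get(self, key: str) -> bytes:
        offset, size = self.index[key]
        self.volume.seek(offset)
        return self.volume.read(size)


# Usage with an in-memory buffer standing in for the RBD image.
store = PackStore(io.BytesIO())
key = store.put(b"int main(void) { return 0; }\n")
content = store.get(key)
```

Since nothing is ever modified or deleted, there is no free-space management
and no fragmentation to deal with, which is what makes a plain offset/size map
sufficient in this sketch.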

Cheers

On 01/02/2021 03:27, Alex Gorbachev wrote:
> Dear Loïc,
>
> I do not have direct experience with this many files, but it resonates for
> me with deduplication, such as borg (https://www.borgbackup.org/) or a
> similar implementation in the latest Proxmox Backup Server (
> https://pbs.proxmox.com/wiki/index.php/Main_Page).  I think you would need
> a filesystem for either, so not sure how well this would integrate directly
> with RBD, but maybe cephfs is an option?  I typically run zfs on top of
> rbd, and use only zfs compression, and then put borg on top of zfs.  There
> is overhead, but this is a very flexible setup, operationally.  All the
> best in your endeavor!
> --
> Alex Gorbachev
> ISS/Storcium
>
>
>
> On Sat, Jan 30, 2021 at 10:01 AM Loïc Dachary <loic@xxxxxxxxxxx> wrote:
>
>> Bonjour,
>>
>> In the context of Software Heritage (a noble mission to preserve all source
>> code)[0], artifacts have an average size of ~3KB and there are billions of
>> them. They never change and are never deleted. To save space it would make
>> sense to write them, one after the other, in an ever-growing RBD volume
>> (more than 100TB). An index, located somewhere else, would record the
>> offset and size of the artifacts in the volume.
>>
>> I wonder if someone already implemented this idea with success? And if
>> not... does anyone see a reason why it would be a bad idea?
>>
>> Cheers
>>
>> [0] https://docs.softwareheritage.org/
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>

-- 
Loïc Dachary, Artisan Logiciel Libre

