Re: Using RBD to pack billions of small files

Hi Dan,

On 01/02/2021 21:13, Dan van der Ster wrote:
> Hi Loïc,
>
> We've never managed 100TB+ in a single RBD volume. I can't think of
> anything, but perhaps there are some unknown limitations when they get so
> big.
> It should be easy enough to use rbd bench to create and fill a massive test
> image to validate everything works well at that size.
Good idea! I'll look for a cluster with 100TB of free space and post my findings.
>
> Also, I assume you'll be doing the IO from just one client? Multiple
> readers/writers to a single volume could get complicated.
Yes.
>
> Otherwise, yes RBD sounds very convenient for what you need.
It is inspired by https://static.usenix.org/event/osdi10/tech/full_papers/Beaver.pdf, which proposes an ad-hoc implementation for packing immutable objects together. I think RBD already provides the underlying logic, even though it is not specialized for this use case. RGW also packs small objects together and would be a good candidate, but it offers more flexibility to modify and delete objects, and I assume writing N objects through RGW would be slower than writing them sequentially to an RBD image. I have not tried it, though, and maybe I should.
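
For illustration only, here is a minimal sketch of the packing idea from the original message: artifacts are appended back to back and an index records each one's offset and size. The class name `PackWriter`, the in-memory `dict` index, and the use of `io.BytesIO` as a stand-in for the mapped RBD image are all assumptions made for this example; nothing here has been tested against a real RBD volume.

```python
import io


class PackWriter:
    """Append-only packer: artifacts are stored back to back in one volume."""

    def __init__(self, volume):
        # `volume` is any seekable binary file object. In the scenario
        # discussed here it would be the RBD image (assumption).
        self.volume = volume
        # Hypothetical in-memory index: key -> (offset, size).
        # In the proposal the index lives somewhere else.
        self.index = {}
        # Next free offset in the volume.
        self.end = 0

    def append(self, key, data):
        """Write `data` at the end of the volume and record where it went."""
        self.volume.seek(self.end)
        self.volume.write(data)
        entry = (self.end, len(data))
        self.index[key] = entry
        self.end += len(data)
        return entry

    def read(self, key):
        """Look up offset and size in the index and read the artifact back."""
        offset, size = self.index[key]
        self.volume.seek(offset)
        return self.volume.read(size)


# Usage with an in-memory buffer standing in for the volume.
volume = io.BytesIO()
pack = PackWriter(volume)
pack.append("a", b"hello")
pack.append("b", b"world!")
```

Because artifacts never change and are never deleted, the index only ever grows and an entry stays valid for as long as the volume exists.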

To be continued.
>
> Cheers, Dan
>
>
> On Sat, Jan 30, 2021, 4:01 PM Loïc Dachary <loic@xxxxxxxxxxx> wrote:
>
>> Bonjour,
>>
>> In the context of Software Heritage (a noble mission to preserve all source
>> code)[0], artifacts have an average size of ~3KB and there are billions of
>> them. They never change and are never deleted. To save space it would make
>> sense to write them, one after the other, in an ever-growing RBD volume
>> (more than 100TB). An index, located somewhere else, would record the
>> offset and size of the artifacts in the volume.
>>
>> I wonder if someone already implemented this idea with success? And if
>> not... does anyone see a reason why it would be a bad idea?
>>
>> Cheers
>>
>> [0] https://docs.softwareheritage.org/
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>

-- 
Loïc Dachary, Artisan Logiciel Libre

