Re: Using RBD to pack billions of small files


 



Hi,

On 2/3/21 9:41 AM, Loïc Dachary wrote:
Just my 2 cents:

You could use the first byte of the SHA sum to identify the image, e.g. using a fixed number of 256 images. Or some flexible approach similar to the way filestore used to store rados objects.
A friend suggested the same to save space. Good idea.


If you want to further reduce the index size, you can store just the offset and let the first 4 or 8 bytes at that offset define the size of the artifact that follows. That's similar to the way Pascal used to store strings in the good ol' times. You might also want to think about using a complete header which also includes the artifact's name etc. This will allow you to rebuild the index if it becomes corrupted. The storage overhead should be insignificant.
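
To make the layout concrete, here is a rough Python sketch of size-prefixed records with a small header. The header layout (8-byte big-endian size followed by a 32-byte SHA-256 digest), the function names, and the use of a generic seekable file object instead of a real RBD image are all my assumptions for illustration, not something tested against Ceph.

```python
import hashlib
import io
import struct

# Assumed header layout: 8-byte big-endian payload size, then a 32-byte SHA-256 digest.
HEADER = struct.Struct(">Q32s")

def append_artifact(image, payload):
    """Append header + payload at the end of the image; return (digest, offset)."""
    digest = hashlib.sha256(payload).digest()
    offset = image.seek(0, io.SEEK_END)
    image.write(HEADER.pack(len(payload), digest))
    image.write(payload)
    return digest, offset

def read_artifact(image, offset):
    """The index only stores the offset; the size comes from the header."""
    image.seek(offset)
    size, _digest = HEADER.unpack(image.read(HEADER.size))
    return image.read(size)

def rebuild_index(image):
    """Recover the digest -> offset mapping by walking the headers in order."""
    index = {}
    end = image.seek(0, io.SEEK_END)
    pos = 0
    while pos < end:
        image.seek(pos)
        size, digest = HEADER.unpack(image.read(HEADER.size))
        index[digest] = pos
        pos += HEADER.size + size
    return index
```

With 40 bytes of header per artifact, the overhead stays small unless the artifacts themselves are only a few bytes long, and rebuild_index() only has to read the headers, not the payloads.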

Your index will become a simple mapping of SHA sum -> offset, and you might also be able to use optimized implementations.
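
As one possible example of such an optimized implementation, the mapping could be kept as a sorted flat array of fixed-size entries and searched with a binary search. This is only a sketch: the 40-byte entry layout, the function names and the image naming scheme are made up for illustration, and it assumes SHA-256 digests.

```python
import hashlib
import struct

ENTRY = struct.Struct(">32sQ")  # assumed: SHA-256 digest + 8-byte offset

def image_for(digest):
    """Pick one of 256 images from the first digest byte (name is hypothetical)."""
    return f"artifacts-{digest[0]:02x}"

def build_index(pairs):
    """Pack (digest, offset) pairs into one sorted, flat bytes object."""
    return b"".join(ENTRY.pack(d, o) for d, o in sorted(pairs))

def lookup(index, digest):
    """Binary search over the fixed-size entries; return the offset or None."""
    lo, hi = 0, len(index) // ENTRY.size
    while lo < hi:
        mid = (lo + hi) // 2
        key, offset = ENTRY.unpack_from(index, mid * ENTRY.size)
        if key == digest:
            return offset
        if key < digest:
            lo = mid + 1
        else:
            hi = mid
    return None
```

A flat array like this needs no per-entry pointers, so it stays close to 40 bytes per artifact and can be memory-mapped instead of loaded.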


Regards,

Burkhard




