Re: Using RBD to pack billions of small files


If it were me, I would do something along the lines of:

- Bundle larger blocks of code into pixz archives
  <https://github.com/vasi/pixz> (essentially indexed tar files, allowing
  random access) and store them in RadosGW.
- Build a small frontend that fetches them (with caching) and provides the
  file contents via whatever your UI is.
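
A rough sketch of the caching frontend idea, in Python. Everything named here is hypothetical: `fetch_bundle` stands in for whatever client call downloads one bundle from RadosGW, and `make_reader` / `read_file` are illustration names only. It also assumes the bundle can be read as an ordinary xz-compressed tar and decompresses the whole cached bundle on each read, so it does not use the pixz index for random access.

```python
import io
import tarfile
from functools import lru_cache
from typing import Callable


def make_reader(fetch_bundle: Callable[[str], bytes], max_bundles: int = 32):
    """Return read_file(bundle_key, member_name) backed by an LRU cache.

    fetch_bundle is a hypothetical callable that downloads one bundle
    (for example from RadosGW) and returns its raw bytes.
    """

    @lru_cache(maxsize=max_bundles)
    def cached_bundle(bundle_key: str) -> bytes:
        # Only one round trip to the object store while the bundle stays cached.
        return fetch_bundle(bundle_key)

    def read_file(bundle_key: str, member_name: str) -> bytes:
        data = cached_bundle(bundle_key)
        # "r:xz" opens an xz-compressed tar stream held in memory.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:xz") as tar:
            member = tar.extractfile(member_name)  # KeyError if name is absent
            if member is None:  # present, but not a regular file
                raise KeyError(member_name)
            return member.read()

    return read_file
```

A real frontend would more likely shell out to pixz or otherwise use its index so that only the needed blocks get decompressed; the cache layer would look much the same either way.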

On Wed, Feb 3, 2021 at 12:55 AM Burkhard Linke <
Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

> Hi,
>
> On 2/3/21 9:41 AM, Loïc Dachary wrote:
> >> Just my 2 cents:
> >>
> >> You could use the first byte of the SHA sum to identify the image, e.g.
> >> using a fixed number of 256 images. Or some flexible approach similar to
> >> the way filestore used to store rados objects.
> > A friend suggested the same to save space. Good idea.
>
>
> If you want to further reduce the index size, you can just store the
> offset, and the first 4? 8? bytes at that offset define the size of the
> following artifact. That's similar to the way Pascal used to store
> strings in the good ol' times. You might also want to think about using
> a complete header which also includes the artifact's name etc. This will
> allow you to rebuild the index if it becomes corrupted. The storage
> overhead should be insignificant.
>
> Your index will become a simple mapping of SHA sum -> offset, and you
> might also be able to use optimized implementations.
>
>
> Regards,
>
> Burkhard
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
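
For what it's worth, the layout described in the quoted mails could be sketched roughly like this in Python. The details are assumptions, not anything from the thread: SHA-256 as the hash, an 8-byte big-endian length, a header that stores the digest rather than a name, and all function names. The image is any seekable binary file object (an `io.BytesIO` here, a mapped RBD device in practice).

```python
import hashlib
import io
import struct

NUM_IMAGES = 256                    # one image per possible first digest byte
HEADER = struct.Struct(">Q32s")     # assumed header: 8-byte length + SHA-256


def image_for(digest: bytes) -> int:
    """Pick the image number from the first byte of the SHA sum."""
    return digest[0]


def append_artifact(image, index: dict, payload: bytes) -> bytes:
    """Append header + payload to the image; the index keeps only the offset."""
    digest = hashlib.sha256(payload).digest()
    offset = image.seek(0, io.SEEK_END)
    image.write(HEADER.pack(len(payload), digest))
    image.write(payload)
    index[digest] = offset
    return digest


def read_artifact(image, index: dict, digest: bytes) -> bytes:
    """Seek to the stored offset; the header says how many bytes follow."""
    image.seek(index[digest])
    size, _stored_digest = HEADER.unpack(image.read(HEADER.size))
    return image.read(size)


def rebuild_index(image) -> dict:
    """Recreate the SHA sum -> offset mapping by walking the headers."""
    index = {}
    end = image.seek(0, io.SEEK_END)
    offset = 0
    while offset < end:
        image.seek(offset)
        size, digest = HEADER.unpack(image.read(HEADER.size))
        index[digest] = offset
        offset += HEADER.size + size
    return index
```

With this header the per-artifact overhead is 40 bytes, and the index can be thrown away and rebuilt from the image alone. A name field or checksum of the header could be added in the same way.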




