Search Postgresql Archives

Re: Database versus filesystem for storing images

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>> Don't store your images in the database.  Store them on the filesystem and
>> store their path in the database
I 100% agree. Use the database as a lookup into the filesystem. Don't load the database up with terabytes of non-searchable binary data? not sure how that would help you?

Here is one idea:

Have a 64-bit sequence that you use to generate an image_id (becomes file name). Hash that id out over a 3-level deep directory structure that allows 4096 entries per directory. Should give you 64-bit worth of files.

CREATE OR REPLACE FUNCTION get_image_path(image_id BIGINT)
RETURNS TEXT AS $$
  DECLARE
    -- 40963, avoids "integer out of range"
    dir1_val BIGINT := 68719476736;
    dir1 BIGINT;
    dir2 BIGINT;
    dir3 BIGINT;
    path TEXT;

  BEGIN
    dir1 := (image_id / dir1_val) % 4096;
    dir2 := (image_id / (4096 * 4096)) % 4096;
    dir3 := (image_id / 4096) % 4096;
    RETURN '/BASE_PATH/' || dir1 || '/' || dir2 ||
           '/' || dir3 || '/' || image_id;
  END;
$$ LANGUAGE PLPGSQL;

test=# select get_image_path(200399322222);
           get_image_path
-------------------------------------
 /BASE_PATH/2/3752/2991/200399322222
(1 row)

>I mean, how do you handle integrity with data
> outside the database?
You don't, the file system handles integrity of the stored data. Although, one must careful to avoid db and fs orphans. Meaning, a record with no corresponding file or a file with no corresponging record. Always write()/insert an image file to the system within a transaction, including writing the image out to the fs. Make sure to unlink any paritally written image files.

>>How do you plan your backup routine
In regards to backup, backup the files one-by-one. Grab the lastest image file refs from the database and start backing up those images. Each successfully backed up image should be followed by inserting that file's database record into a remote db server. If anything fails, cleanup the partial image file (to avoid orphaned data) and rollout the transaction.

just one idea. i'm sure there are other ways of doing it. point is, this is completely possible to do reliably.

andrew



Jorge Godoy wrote:
John McCawley <nospam@xxxxxxxxxxxx> writes:

Don't store your images in the database.  Store them on the filesystem and
store their path in the database.  Anyone that tells you otherwise is a stark
raving madman :)

My system is very heavily used, and our pg_dump is only a few gigs.  Meanwhile
our images/documents storage is well over a hundred gigs.  I'd hate to think
that I'd have to dump and restore 100 gigs every time I wanted to dump the
newest data to the development database.

How do you plan your backup routine and how do you guarantee that on a failure
all needed data is restored?  I mean, how do you handle integrity with data
outside the database?



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux