Re: [SPAM] Re: Architectural question

Moreno Andreo <moreno.andreo@xxxxxxxxxx> · Wed, 23 Mar 2016 10:14:06 +0100

Il 11/03/2016 17:37, Jim Nasby ha scritto:
On 2/22/16 8:40 AM, Moreno Andreo wrote:
Il 18/02/2016 21:33, Jim Nasby ha scritto:
Depending on your needs, could could use synchronous replication as
part of that setup. You can even do that at a per-transaction level,
so maybe you use sync rep most of the time, and just turn it off when
inserting or updating BLOBS.
This sounds good, and when everything is OK we have I/O operation split
across the two servers; a small delay in synchronizing blobs should not
be a big deal, even if something bad happens (because of XLOG), right?

It all depends on what you can tolerate. You also don't have to use 
synchronous replication; normal streaming replication is async, so if 
you can stand to lose some data if one of the servers dies then you 
can do that.
I can't tolerate data loss, so synchronous replication is mandatory (I 
had a case this week of a customer asking for an old document that I 
couldn't find in the database, either if the "attach present" flag was 
true... and I had a bit of a hard time trying to convince the customer 
it was his fault... :-) )

Last thing: should blobs (or the whole database directory itself) 
go in
a different partition, to optimize performance, or in VM environment
this is not a concern anymore?

First: IMO concerns about blobs in the database are almost always
overblown.
In many places I've been they say, at last, "BLOBs are slow". So I
considered this as another point to analyze while designing server
architecture. If you say "don't mind", then I won't.

It all depends. They're certainly a lot slower than handling a single 
int, but in many cases the difference just doesn't matter.
The main goal is to be *quick*. A doctor with a patient on the other 
side of his desk does not want to wait, say, 30 seconds for a clinical 
record to open.
Let me explain what is the main problem (actually there are 2 problems).
1. I'm handling health data, and sometines they store large images (say 
an hi-res image of an x-ray). When their team mates (spread all over the 
city, not in the same building) ask for that bitmap (that is, 20 
megabytes), surely it can't be cached (images are loaded only if 
requested by user) and searching a 35k rows, 22 GB table for the 
matching image should not be that fast, even with proper indexing 
(patient record number)
2. When I load patient list, their photo must be loaded as well, because 
when I click on the table row, a small preview is shown (including a 
small thumbnail of the patient's photo). Obviously I can't load all 
thumbs while loading the whole patient list (the list can be up to 
4-5000 records and photo size is about 4-500kBytes, so it would be an 
enormous piece of data to be downloaded.

30GB of blobs on modern hardware really isn't a big deal, and there's
a *lot* to be said for not having to write the extra code to manage
all that by hand.
What do you mean? Extra code?

If the blob is in the database then you have nothing extra to do. It's 
handled just like all your other data.

If it's a file in a file system then you need to:

- Have application code that knows how and where to get at the file
- Have a way to make those files available on all your webservers
- Have completely separate backup and recovery plans for those files

That's a lot of extra work. Sometimes it's necessary, but many times 
it's not.
In my case I think it's not necessary, since all blobs go into a bytea 
field in a table that's just for them. It's an approach that helps us 
keeping up with privacy, since all blobs are encrypted, and can be 
accessed only by application.

When it comes to your disk layout, the first things I'd look at 
would be:

- Move the temporary statistics directory to a RAM disk
- Move pg_xlog to it's own partition
So I need another vDisk, not that big, for pg_xlog?

Yeah, but note that with virtualization that may or may not help.
I was afraid of that. With virtualization we are bound to that hardware 
lying behind us, and that we can't see nor control. Even if we create 2 
vDisk, they should be bound to the same host spindles, and so having two 
vDisk is completely useless.
I'm thinking of increase checkpoint_segments interval, so
In the next two week I should have the VM deployed, so I'll see what 
I'll have in terms of speed and response (looking at the amount we are 
paying, I hope it will be a very FAST machine... :-D)

Thanks
Moreno.-

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance