Re: [SPAM] Re: Architectural question

Moreno Andreo <moreno.andreo@xxxxxxxxxx> · Thu, 24 Mar 2016 11:56:52 +0100

On 23/03/2016 19:51, Jim Nasby wrote:
On 3/23/16 4:14 AM, Moreno Andreo wrote:
The main goal is to be *quick*. A doctor with a patient on the other
side of his desk does not want to wait, say, 30 seconds for a clinical
record to open.
Let me explain what the main problem is (actually there are 2 problems).
1. I'm handling health data, and sometimes they store large images (say
a hi-res image of an x-ray). When their teammates (spread all over the
city, not in the same building) ask for that bitmap (that is, 20
megabytes), surely it can't be cached (images are loaded only if
requested by the user) and searching a 35k-row, 22 GB table for the
matching image should not be that fast, even with proper indexing
(patient record number).

Why wouldn't that be fast? Unless the TOAST table for that particular 
table is pretty fragmented, 
I'm running on Debian with an ext4 file system. I'm not expecting 
fragmentation. Am I wrong?
pulling up thumbnails should be very fast. I'd expect it to be the 
cost of reading a few pages sequentially.
I'm not extracting thumbnails. I have a layout that is similar to an 
email client, with rows of data and, in one column, a clip that 
lets the user load the real image, not its thumbnail.

If you're mixing all your blobs together, then you might end up with a 
problem. It might be worth partitioning the blob table based on the 
size of what you're storing.
OK, I went to the documentation and read about partitioning :-) I knew about 
inheritance, but I was totally unaware of partitioning. Today is a 
good day, because I've learned something new.
You're saying that it would be better to create, for example, a table for 
blobs < 1 MB, another for blobs between 1 and 5 MB and another for blobs 
> 5 MB? And what about the master table? Should it be one of these three?
Blob data and size are unpredictable (from a 2 kB RTF to a 20 MB JPG).
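Just to check that I understand the idea, here is a minimal sketch (in Python, only to show the routing logic) of how a blob could be assigned to a table by its size. The table names and the helper partition_for are made up by me, and the 1 MB / 5 MB limits are just the example values from my question, not something anyone has recommended. I also assumed that a blob of exactly 5 MB goes into the largest bucket.

```python
# Hypothetical size buckets; names and thresholds are illustrative only.
MB = 1024 * 1024

# (exclusive upper bound in bytes, table name), checked in ascending order.
PARTITIONS = [
    (1 * MB, "blobs_small"),   # size < 1 MB
    (5 * MB, "blobs_medium"),  # 1 MB <= size < 5 MB
]
DEFAULT_PARTITION = "blobs_large"  # size >= 5 MB


def partition_for(size_bytes: int) -> str:
    """Return the (hypothetical) table a blob of this size would be stored in."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    for upper_bound, table_name in PARTITIONS:
        if size_bytes < upper_bound:
            return table_name
    return DEFAULT_PARTITION
```

With these limits a 2 kB RTF would land in blobs_small and a 20 MB JPG in blobs_large.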

2. When I load the patient list, their photo must be loaded as well, because
when I click on the table row, a small preview is shown (including a
small thumbnail of the patient's photo). Obviously I can't load all
thumbs while loading the whole patient list (the list can be up to
4,000-5,000 records and photo size is about 400-500 kB, so it would be an
enormous piece of data to be downloaded).

I would think a thumbnail would be 30-40k or less, not 500k. 
You have a point. We advised the users of that, but they don't care, or 
simply don't know what they are doing. We need to change the application 
to accept files of 50k at most.
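To put rough numbers on it, here is a back-of-the-envelope estimate using only the figures from this thread (up to 5000 records, about 500 kB per photo today, 50 kB with the planned limit). The function name is mine, and I'm using decimal units (1 MB = 1000 kB) for simplicity.

```python
def list_payload_mb(records: int, photo_kb: int) -> float:
    """Total photo data, in MB, if every photo were loaded with the list."""
    return records * photo_kb / 1000


print(list_payload_mb(5000, 500))  # current photo size -> 2500.0
print(list_payload_mb(5000, 50))   # with a 50 kB cap   -> 250.0
```

So even with the 50k cap, loading every photo together with the list would still mean a couple of hundred MB, which is why the photo has to be fetched per row.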
It sounds like part of the problem is you should keep the thumbnails 
separate from the high-res file. But really you should probably do 
that for everything... I suspect there are parts of the UI where you want 
to display a fairly low-res version of something like an xray, only 
pulling the raw image if someone actually needs it.
That's what we are doing. Thumbnails are only patient portraits, while 
no other blob (clinical scans) is read until someone asks for it.
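Roughly, the access pattern looks like the sketch below. This is not our real code: I'm using Python's built-in sqlite3 with an in-memory database only so the example runs on its own (the real database is PostgreSQL), and all table, column and function names are invented for illustration.

```python
import sqlite3

# In-memory stand-in database; schema and names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE patient_thumbs (
        patient_id INTEGER PRIMARY KEY REFERENCES patients(id),
        thumb      BLOB NOT NULL
    );
""")
conn.execute("INSERT INTO patients VALUES (1, 'Test Patient')")
conn.execute("INSERT INTO patient_thumbs VALUES (1, ?)", (b"\x00" * 40_000,))


def load_patient_list(conn):
    """Load the list without touching any image data."""
    return conn.execute("SELECT id, name FROM patients ORDER BY name").fetchall()


def load_thumbnail(conn, patient_id):
    """Fetch one thumbnail, only when the user clicks on that row."""
    row = conn.execute(
        "SELECT thumb FROM patient_thumbs WHERE patient_id = ?",
        (patient_id,),
    ).fetchone()
    return row[0] if row else None
```

The list query never reads the blob column, so the cost of the images is paid one row at a time.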

Thanks
Moreno.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)