Enrico Sirola wrote:
Typically, the arrays contain 1000 elements each, and an operation is either multiplying an array by a scalar or multiplying it element-by-element with another array. The time to rescale 1000 arrays, multiply each by another array, and finally sum the 1000 resulting arrays should be short enough for an interactive application (let's say 0.5s), assuming no disk access is required. Disk access will obviously degrade performance a bit at the beginning, but the workload is mostly read-only, so after a while the whole table will be cached anyway. The table containing the arrays would be truncated and repopulated every day, and the number of arrays is expected to be more or less 150000 (at least, that is what we have now). Currently we have C++ middleware between the calculations and aggressive caching of the table contents (and we don't use arrays, just one row per element), but the application could be refactored (and simplified a lot) if we had a smart way to store this data in the DB.
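(For concreteness, a minimal layout along those lines could be a single table with one row per array and a float8[] column; the table and column names below are just placeholders:)

    CREATE TABLE array_data (
        name text PRIMARY KEY,    -- identifier for each of the ~150000 arrays
        vals float8[] NOT NULL    -- the ~1000 element values, packed into one array column
    );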
I don't know if the speed will meet your needs, but you might test to see if PL/R will work for you:

    http://www.joeconway.com/plr/

You could use pg.spi.exec() from within the R procedure to grab the arrays, do all of your processing inside R (which uses whatever BLAS you've set it up to use), and then return the result out to Postgres.

Joe
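For illustration, a PL/R function along those lines might look roughly like the untested sketch below. The array_data table and vals column are the invented names from the schema above, and it assumes array columns come back from pg.spi.exec() as something unlist() can flatten, so check that against your PL/R version:

    CREATE OR REPLACE FUNCTION scale_mult_sum(scale_by float8, mult_by float8[])
    RETURNS float8[] AS $$
        -- depending on the PL/R version, the arguments may only be
        -- available as arg1/arg2 rather than by name
        # pull every stored array out of the table (names are placeholders)
        rs <- pg.spi.exec("SELECT vals FROM array_data")
        acc <- 0
        for (i in seq_len(nrow(rs))) {
            # rescale, multiply element-by-element with the passed-in array,
            # and accumulate the running sum, all inside R
            acc <- acc + unlist(rs[i, "vals"]) * scale_by * mult_by
        }
        # an R numeric vector maps back to a float8[] on the way out
        acc
    $$ LANGUAGE plr;

The client would then issue one call per request, e.g. SELECT scale_mult_sum(2.0, vals) FROM array_data WHERE name = 'some_array', instead of pulling 150000 rows across the wire.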