Search Postgresql Archives

Re: Inserts create new records in reporting table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



200K total records, or 200K records per day/month/quarter/etc?  Because 200K records isn't that much on modern hardware.  Heck, it wasn't much 20 years ago on "Enterprise" hardware.

And you don't mention the window for loading the data into "the" table, and then into the reporting table. 

A trigger will slow down each insert into "the" table, since each operation will do 2x (or more) work.  Only you know whether it's more important to get the data into "the" table ASAP and then then load into the reporting table at your leisure, or get it into both tables ASAP.

No matter what you do, create as few indices as possible before the load phase.  Add the rest afterwards.

Lastly, remember "locality of data".  That makes caches much more effective.  A single query on "the" table when it's clustered on an index tuned to support the reporting table will run Really Fast, whereas a query (or trigger) which bounces all around the table in each fetch and insert won't be nearly as fast.

Really lastly: test each method.

On 12/31/22 01:02, Bret Stern wrote:
Love the forum,

I'm bringing 100k - 200k of AR transactions into a table from several separate ERP companies.

As I insert the records, does it make sense to run a trigger event to build or insert data into

a reporting table (to save generating a query to build the reporting table later.)


For example, I bring in 120k records which contain invoice dates from the beginning of time

in the ERP system. (eg; records with dated transactions from 2010...2022)


Conceptual data

[Location][InvoiceDate][Amount]

Dallas        2012-01-09    7500.00
Dallas        2012-11-19    1200.00
Dallas        2012-04-02    500.00
Phoenix       2012-01-03    1200.00
Phoenix        2012-04-12    7500.00

Would it be possible, or would it make sense to create a trigger which populated another

table like below after each insert...or is is a resource killer..or just unnecessary, and smarter to create a query

that performs calculated on the fly columns for the dates of invoice?


[Location][InvoiceDate][Amount]    [Jan]            [Feb]     [Mar]            [Apr]          [May]             [Jun]      [Jul]         [Aug]           [Sep]            [Oct]     [Nov]         [Dec]

Dallas         2012-01-09                   7500.00

Dallas 2012-11-19 1200.00

Dallas 2012-04-02 500.00

Phoenix    2012-01-03                    1200.00

Phoenix 2012-04-12 7500.00


Bret




--
Born in Arizona, moved to Babylonia.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux