On Mon, Sep 10, 2012 at 9:34 AM, Craig Gibson <craiggib@xxxxxxxxx> wrote:
> Hi all
>
> I am no database wizard so I am hoping someone may be able to assist me :)
>
> I get a daily CSV file of 6.5 million records. I create a temporary
> table and COPY them in. On completion I create an index on the mdnid
> column. This column is also indexed in table 2. This part is very
> fast. I had some 'checkpoint too often' issues, but those I have
> resolved.
>
> I then use the following procedure to update all the records, and if a
> record does not exist, insert it instead:
> ...
>
> From my understanding, a for loop is encapsulated in a cursor anyway,
> so there is no need to do that. Am I fundamentally doing something
> wrong, as the operation is slow as molasses?

How big is the parent table? Are you CPU limited or IO limited?

If you are not CPU limited, then I would guess that the indexes on the
parent table do not fit in RAM or shared_buffers, and that maintaining
the indexes on the parent table during the updates/inserts is the
bottleneck.

> Maybe there is a better way altogether
> that I have not thought of? The bottom line is that at no point can
> the e_lookup table be unavailable to clients,

Does it always have to be available for updates, or just for selects?

> else I would have just
> done a simple drop and rename post the COPY.

Maybe you can just do an atomic rename.

Cheers,

Jeff
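Craig's procedure body was snipped above, but the step it describes (update each staging row into e_lookup, insert it if missing) can usually be expressed as two set-based statements rather than a per-row loop. A minimal sketch, assuming the staging table is named e_import and carries a single illustrative payload column val (only e_lookup and mdnid come from the thread):

    BEGIN;

    -- Update the rows that already exist in e_lookup from the freshly
    -- loaded staging table (staging table and non-key column names are
    -- illustrative).
    UPDATE e_lookup l
       SET val = s.val
      FROM e_import s
     WHERE l.mdnid = s.mdnid;

    -- Insert the rows that are not in e_lookup yet.
    INSERT INTO e_lookup (mdnid, val)
    SELECT s.mdnid, s.val
      FROM e_import s
     WHERE NOT EXISTS (SELECT 1 FROM e_lookup l WHERE l.mdnid = s.mdnid);

    COMMIT;

Two statements over 6.5 million rows avoid the per-row overhead of a loop, though a concurrent writer could still insert a conflicting mdnid between the two statements; on 2012-era PostgreSQL (before INSERT ... ON CONFLICT existed) that is normally handled by locking the table or serializing the load.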
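To answer Jeff's question about whether the parent table's indexes fit in memory, the sizes can be read straight from the catalogs; this is a generic check, not something from the thread:

    -- Combined size of all indexes on e_lookup, plus the configured buffer pool.
    SELECT pg_size_pretty(pg_indexes_size('e_lookup')) AS index_size;
    SHOW shared_buffers;

If the index total is much larger than shared_buffers (and than RAM), each random index update during the load is likely to hit disk, which would match the "slow as molasses" symptom.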
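The "atomic rename" Jeff mentions can be sketched as a name swap done inside one transaction, assuming the new data has already been loaded and indexed in a table called e_lookup_new (a hypothetical name):

    BEGIN;
    -- Both renames commit together, so clients either see the old
    -- e_lookup or the new one, never a missing table.
    ALTER TABLE e_lookup     RENAME TO e_lookup_old;
    ALTER TABLE e_lookup_new RENAME TO e_lookup;
    COMMIT;

    DROP TABLE e_lookup_old;

The renames take an ACCESS EXCLUSIVE lock, so queries against e_lookup block for the instant of the swap rather than failing; note, however, that views and foreign keys follow the table they were created against rather than the name, so they would keep pointing at e_lookup_old.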