On Wed, 2008-02-06 at 12:27 +0100, Dimitri Fontaine wrote:
> Multi-Threading behavior and CE support
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>
> Now, pgloader will be able to run N threads, each one loading some data
> to a partitioned child-table target. N will certainly be configured
> depending on the number of server cores and not depending on the
> partition numbers...
>
> So what do we do when reading a tuple we want to store in a partition
> which has no dedicated Thread started yet, and we already have N Threads
> running? I'm thinking about some LRU(Thread) to choose a Thread to
> terminate (launch COPY with current buffer and quit) and start a new one
> for the current partition target.
> Hopefully there won't be such high values of N that the LRU is a bad
> choice per se, and the input data won't be so messy as to have to
> stop/start Threads at each new line.

For me, it would be good to see a --parallel=n parameter that would
allow pgloader to distribute rows in "round-robin" manner to "n"
different concurrent COPY statements, i.e. a non-routing version. Making
that work well, whilst continuing to do error-handling, seems like a
challenge, but a very useful goal.

Adding intelligence to the row distribution may be technically hard but
may also simply move the bottleneck onto pgloader. We may need multiple
threads in pgloader, or we may just need multiple sessions from
pgloader.

Experience from doing the non-routing parallel version may help in
deciding whether to go for the routing version.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com
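
To make the non-routing idea above concrete, here is a minimal Python
sketch of round-robin distribution to n workers. It is not pgloader code:
parallel_load, copy_batch and batch_size are hypothetical names, and
copy_batch merely stands in for one COPY statement issued on that
worker's own session. Error handling, the hard part mentioned above, is
deliberately left out.

```python
import queue
import threading
from itertools import cycle

_DONE = object()  # sentinel: no more rows will arrive on this queue


def parallel_load(rows, n, copy_batch, batch_size=1000):
    """Distribute rows round-robin over n worker threads.

    copy_batch(worker_id, batch) is a hypothetical callback standing in
    for one COPY statement on that worker's own database session.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    # Bounded queues so the reader cannot run far ahead of slow workers.
    queues = [queue.Queue(maxsize=batch_size * 2) for _ in range(n)]

    def worker(worker_id, q):
        batch = []
        while True:
            row = q.get()
            if row is _DONE:
                break
            batch.append(row)
            if len(batch) >= batch_size:
                copy_batch(worker_id, batch)
                batch = []
        if batch:  # flush the final, possibly short, batch
            copy_batch(worker_id, batch)

    threads = [
        threading.Thread(target=worker, args=(i, q))
        for i, q in enumerate(queues)
    ]
    for t in threads:
        t.start()

    # Non-routing distribution: row k goes to worker k % n.
    for row, q in zip(rows, cycle(queues)):
        q.put(row)

    for q in queues:
        q.put(_DONE)
    for t in threads:
        t.join()
```

With a single reader thread feeding the queues, the reader itself can
become the bottleneck, which is the same concern raised above for a
routing version; the sketch only shows the distribution pattern.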