I've also used Pentaho Data Integration (previously known as Kettle)
quite extensively, and can recommend it. It supports many different
databases and has fairly good documentation (although thin in some
areas). It has a GUI drag-and-drop tool that can be used to configure
transformations and is very flexible. It also has an active community
that responds when you have issues. I use it in regular jobs that run
every 5 minutes and hourly to copy and transform data from a SQL Server
DB to a PostgreSQL DB.

I use COPY when I can simply select data into a CSV and load it into
another DB - but, as Tomi said, when you have to do primary key
generation, row merging, data cleanup, and data transformations, I
would use some sort of ETL tool over plain SQL.

My 2 cents,
Jeremy Haile

On Fri, 26 Jan 2007 15:14:22 +0000, "Tomi N/A" <hefest@xxxxxxxxx> said:
> > Besides being easy to schedule and very flexible, manipulating data
> > with queries is extremely powerful and fairly easy to maintain,
> > assuming you know a little SQL -- thanks to PostgreSQL's huge array
> > of built-in string manipulation functions. The skills you learn
> > here will pay off when using the database for other things as well.
> >
> > Not only that, but this approach will be fast, since it is
> > declarative and handles entire tables at once, as opposed to
> > DTS-ish solutions, which tend to do processing record by record.
> > Not to mention they are overcomplicated and tend to suck. (DTS does
> > have the ability to read from any ODBC source, which is nice...
> > but that does not apply here.)
>
> Different strokes for different folks, it seems.
>
> I'd argue that COPY followed by a barrage of plpgsql statements can't
> be used for anything but the most trivial data migration cases (where
> it's invaluable): cases in which you have line-organized data input
> for a handful of tables at most.
>
> In my experience (which is probably very different from anyone
> else's), most real-world situations include data from a number of
> very different sources, ranging from the simplest (.csv and,
> arguably, .xml) to the relatively complex (a couple of proprietary
> databases, lots of tables, on-the-fly row merging, splitting or
> generating primary keys, date format problems, and general
> pseudo-structured, messed-up information).
>
> Once you've got your data in your target database (say, pgsql), using
> SQL to manipulate the data makes sense, but it is only the _final_
> step of an average, real-world data transformation.
>
> Cheers,
> t.n.a.
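P.S. To make the COPY half of this concrete, here's a minimal sketch.
The table, columns, and file paths are all made up for illustration,
and you'd produce the CSV however your source system allows (bcp on
the SQL Server side, for instance):

  -- Server-side load of a CSV into a staging table:
  COPY staging_orders (order_id, customer_name, order_date)
      FROM '/tmp/orders.csv' WITH CSV;

  -- Or client-side from psql, if the file lives on your machine:
  \copy staging_orders from 'orders.csv' with csv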
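And once the raw rows are staged, the cleanup/merge step Tomi describes
can be done set-based in plain SQL. Again just a sketch with invented
names: the "merge" is the usual update-then-insert pattern, and the
sequence stands in for primary key generation.

  -- Update rows we've seen before, cleaning strings/dates as we go:
  UPDATE orders o
     SET customer_name = initcap(trim(s.customer_name)),
         order_date    = to_date(s.order_date, 'MM/DD/YYYY')
    FROM staging_orders s
   WHERE o.source_id = s.order_id;

  -- Insert the rows we haven't, generating new primary keys:
  INSERT INTO orders (id, source_id, customer_name, order_date)
  SELECT nextval('orders_id_seq'),
         s.order_id,
         initcap(trim(s.customer_name)),
         to_date(s.order_date, 'MM/DD/YYYY')
    FROM staging_orders s
   WHERE NOT EXISTS (SELECT 1 FROM orders o
                      WHERE o.source_id = s.order_id);

Because both statements work on whole tables at once, this is the
declarative, set-at-a-time processing the earlier post contrasts with
record-by-record DTS-style tools.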