
Re: [Fwd: [PORTS] M$ SQL server DTS package equivalent in

I've also used Pentaho Data Integration (previously known as Kettle)
quite extensively, and can recommend it.  It supports many different
databases and has fairly good documentation (although thin in some
areas).  It has a GUI drag-and-drop tool that can be used to configure
transformations and is very flexible.  It also has an active community
that responds when you have issues.

I use it as part of regular jobs that run every 5 minutes and hourly
to copy and transform data from a SQL Server DB to a PostgreSQL DB.  I
use COPY when I can simply select data into a CSV and load it into
another DB - but as Tomi said, when you have to do primary key
generation, row merging, data cleanup, and data transformations, I
would use some sort of ETL tool over just SQL.
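
For the simple cases, the COPY step is just a one-liner plus whatever
cleanup you need afterwards.  A rough sketch (table and file names are
made up for illustration):

    -- Load a CSV exported from SQL Server into a staging table
    COPY staging_orders (order_id, customer_id, order_date, total)
    FROM '/var/data/orders_export.csv'
    WITH CSV HEADER;

From there, ordinary SQL can merge the staging rows into the real
tables.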

My 2 cents,
Jeremy Haile


On Fri, 26 Jan 2007 15:14:22 +0000, "Tomi N/A" <hefest@xxxxxxxxx> said:
> > Besides being easy to schedule and very flexible, manipulating data
> > with queries is extremely powerful and fairly easy to maintain,
> > assuming you know a little SQL -- thanks to PostgreSQL's huge array of
> > built-in string manipulation functions.  The skills you learn here
> > will also pay off when using the database for other things.
> >
> > Not only that, but this approach will be fast since it is declarative
> > and handles entire tables at once, as opposed to DTS-ish solutions,
> > which tend to process records one at a time.  Not to mention they are
> > overcomplicated and tend to suck.  (DTS does have the ability to read
> > from any ODBC source, which is nice... but that does not apply here.)
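> >
> > To make that concrete, a hypothetical set-based cleanup (table and
> > column names invented for illustration):
> >
> >     -- One declarative statement transforms the whole table at once,
> >     -- rather than looping over it record by record
> >     INSERT INTO customers (customer_id, full_name, email)
> >     SELECT src_id,
> >            initcap(btrim(raw_name)),
> >            lower(btrim(raw_email))
> >     FROM   staging_customers
> >     WHERE  raw_email LIKE '%@%';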
> 
> Different strokes for different folks, it seems.
> I'd argue that COPY followed by a barrage of plpgsql statements can't
> be used for anything but the most trivial data migration cases (where
> it's invaluable): cases where you have line-organized data input for a
> handful of tables at most.
> In my experience (which is probably very different from anyone
> else's), most real-world situations involve data from a number of very
> different sources, ranging from the simplest (.csv and, arguably,
> .xml) to the relatively complex (a couple of proprietary databases,
> lots of tables, on-the-fly row merging, splitting or generating
> primary keys, date format problems, and generally pseudo-structured,
> messed-up information).
> Once you've got your data in your target database (say, pgsql), using
> SQL to manipulate the data makes sense, but it is only the _final_
> step of an average, real-world data transformation.
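>
> As a hypothetical sketch of that final step (table names, sequence,
> and date formats invented for illustration):
>
>     -- Merge rows from two differently-formatted staging sources,
>     -- generating fresh primary keys from a sequence as we go
>     INSERT INTO people (person_id, name, born)
>     SELECT nextval('people_id_seq'), name,
>            to_date(born_text, 'DD.MM.YYYY')
>     FROM   staging_source_a
>     UNION ALL
>     SELECT nextval('people_id_seq'), name,
>            to_date(born_text, 'MM/DD/YYYY')
>     FROM   staging_source_b;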
> 
> Cheers,
> t.n.a.

