Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

Hi,

On Mon, Apr 20, 2020 at 12:12 PM Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:
On Mon, 2020-04-20 at 11:59 -0300, Avinash Kumar wrote:
> On Mon, Apr 20, 2020 at 11:39 AM Laurenz Albe <laurenz.albe@xxxxxxxxxxx> wrote:
> > On Mon, 2020-04-20 at 14:02 +0000, Pepe TD Vo wrote:
> > > Can we access an Oracle table within Postgres?  How?  Do we need to create
> > > a database link, and/or how can the two talk to each other?
> >
> > That would be the Oracle foreign data wrapper:
> > https://laurenz.github.io/oracle_fdw/
>
> I agree. In my experience, copying with oracle_fdw performs a lot faster than the
> options that exist today for offline data copy (almost 3 times faster than Ora2Pg
> in some tests). So one may just avoid the longer route of generating a CSV
> and then looking for options to load it faster.
>
> However, for data validation - do you think we could do something like creating
> an MD5 hash of the Oracle foreign table (using dbms_crypto.hash()) and of the
> local Postgres table (using md5())?
> What else do you think is best for validating the data between Oracle and Postgres,
> other than application and QA tests related to the app?

If there is no error transferring the data, why should there be any need for validation?
Unless you don't trust oracle_fdw, of course.
I have seen clear ORA errors or PG errors/exceptions when something goes wrong with a copy through the FDW, so oracle_fdw has no real chance of making a silent mistake. However, the QA teams still have their job to do, for various reasons, when critical application data is being copied from Oracle to PG. So it is difficult to deny testing, even with a solution like Oracle GoldenGate (GG does not do this job, but as an example).
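
For reference, a minimal sketch of the kind of copy I mean (the server name, connection string, credentials, and table names are all made up for illustration):

  CREATE EXTENSION oracle_fdw;

  -- hypothetical Oracle connection; adjust dbserver and credentials
  CREATE SERVER ora_srv FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (dbserver '//oracle-host:1521/ORCLPDB1');

  CREATE USER MAPPING FOR CURRENT_USER SERVER ora_srv
    OPTIONS (user 'SCOTT', password 'tiger');

  -- expose the Oracle schema's tables as foreign tables
  -- in a separate schema on the Postgres side
  CREATE SCHEMA scott_fdw;
  IMPORT FOREIGN SCHEMA "SCOTT" FROM SERVER ora_srv INTO scott_fdw;

  -- the copy itself: one INSERT ... SELECT per table
  INSERT INTO public.emp SELECT * FROM scott_fdw.emp;

Any failure on either side during the INSERT aborts the statement with an explicit error, which is why I trust the copy itself.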

Calculating MD5 sums or similar of the data before and after would be difficult,
because, for example, timestamps are stored quite differently.
You'd have to convert the data into an identical text format on both systems,
and I cannot think of a good way to do that.
I think for this reason we may just import the Oracle schema into a different schema, let's say scott_fdw (as in the sketch above), and then JOIN each PG table with its corresponding foreign table and generate and compare hashes. Since both sides are then rendered through PostgreSQL's own type output, that may rule out the problem of different storage formats. Let me know if you still see anything going wrong.
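
Continuing the sketch above, the comparison could look something like this (assuming emp has a key column empno; all names are illustrative):

  -- compare row-level hashes; both sides pass through PostgreSQL's
  -- type output functions, so the text formats match
  SELECT count(*) AS mismatched_rows
  FROM public.emp l
  FULL JOIN scott_fdw.emp f USING (empno)
  WHERE md5(l::text) IS DISTINCT FROM md5(f::text);

Zero means every row matched; rows missing on either side also count as mismatches, because the outer join NULL-extends them. Of course, this fetches every Oracle row over the FDW once more, so for TB-sized tables it is only worth running where the QA requirement justifies it.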

I think that everything beyond comparing the row count would be paranoid.
Maybe, if the above does not work.
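
For completeness, that simpler check would just be (same illustrative names):

  -- quick sanity check: compare per-table row counts
  SELECT (SELECT count(*) FROM public.emp)    AS pg_rows,
         (SELECT count(*) FROM scott_fdw.emp) AS oracle_rows;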

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com



--
Regards,
Avinash Vallarapu
