Re: automated 'discovery' of a table : potential primary key, columns functional dependencies ...

On 11/22/19 2:05 PM, Rémi Cura wrote:
Hello dear List,
I'm currently wondering about how to streamline the normalization of a new table.

I often have to import messy CSV files into the database, and making a clean, normalized version of these takes me a lot of time (think dozens of columns and millions of rows).

To me, messy means the information needed to do what you describe below is not available. Personally, I think your best bet is to get the data into tables and then use visualization tools to help you determine it. My guess is there will be a lot of data cleaning going on before you can get to a well-ordered table layout.


I wrote some code to automatically import a CSV file and infer the type of each column.
Now I'd like to quickly get an idea of
  - what would be the most likely primary key
  - what are the functional dependencies between the columns
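
As a rough illustration of the two checks listed above, here is a minimal sketch in plain Python. It assumes the imported rows are available as a list of dicts with missing values stored as None; the function names, the sample rows, and the `max_size` limit are all made up for the example and do not come from any existing tool. A column combination is reported as a candidate key when its values are unique and non-null, and a dependency a -> b is reported when every value of a maps to exactly one value of b. Both results only describe the data at hand: a dependency can hold by coincidence in a sample, so the output is a list of hypotheses for the DBA to review, not a model.

```python
from itertools import combinations


def candidate_keys(rows, columns, max_size=2):
    """Return minimal column combinations that are unique and non-null in rows."""
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(columns, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            seen = set()
            is_key = True
            for row in rows:
                value = tuple(row[c] for c in combo)
                if None in value or value in seen:
                    is_key = False
                    break
                seen.add(value)
            if is_key:
                found.append(combo)
    return found


def functional_dependencies(rows, columns):
    """Return (a, b) pairs where each value of a maps to a single value of b."""
    deps = []
    for a in columns:
        for b in columns:
            if a == b:
                continue
            mapping = {}
            holds = True
            for row in rows:
                # setdefault stores the first b seen for this a, then we compare.
                if mapping.setdefault(row[a], row[b]) != row[b]:
                    holds = False
                    break
            if holds:
                deps.append((a, b))
    return deps


# Hypothetical sample data, standing in for an imported CSV.
columns = ["order_id", "customer", "city"]
rows = [
    {"order_id": 1, "customer": "ann", "city": "Lyon"},
    {"order_id": 2, "customer": "bob", "city": "Paris"},
    {"order_id": 3, "customer": "ann", "city": "Lyon"},
]

print(candidate_keys(rows, columns))
# [('order_id',)]
print(functional_dependencies(rows, columns))
# [('order_id', 'customer'), ('order_id', 'city'), ('customer', 'city'), ('city', 'customer')]
```

With millions of rows, the same tests would more likely be pushed into the database, for example by comparing count(*) with count(DISTINCT col) for each column, rather than pulling every row into Python.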

The goal is **not** to automate the modelling process,
but rather to automate the tedious phase of information collection
that is necessary for the DBA to make a good model.

If this goes well, I'd like to automate further tedious stuff (like splitting a table into several ones with appropriate foreign keys / constraints).

I'd be glad to have some feedback / pointers to tools in plpgsql or even plpython.

Thank you very much
Remi




--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx




