On Friday 15. February 2008, fdu.xiaojf@xxxxxxxxx wrote: >Hi all, > >I have a large sdf file with many records of molecules and associated >data items and I want to save them in a PostgreSQL database. There are >about less than 40 data items for every molecule(the names of the data >items fore each molecule are the same, but values differ). The number >of molecules may exceed 20 million. > >Now I have come up with two ways to construct the table: > >1) a table with about 40 columns, every data item has its > corresponding column, and one molecule corresponds to one row in the > table. > >This is direct and simple. The drawbacks is if I want to add more data >types to the database, I have to modify the structure of the table. > >2) a table with just 3 columns: > > CREATE TABLE mytable( > id serial, > data_name text, > data_value text > ); > >Then a single molecule will corresonds to about 40 rows in the > database. This is a sound concept, but I'd rather store the data_name in a separate table with an integer key, and replace data_name in mytable with a data_name_fk REFERENCES data_names (data_name_id). That's just Occam's Razor applied to database design, aka first normal form. You'd probably store the name of the molecule in a third table. Then you have a model very similar to the classic 'book database' where a book can have multiple authors, and an author can have multiple books. There are examples for this design all over the place. >If I need to add more data types to the table, I just need to add new >rows with new "data_name" column values. The drawback of this table is >it has too many rows(40 times of the former one) and waste a lot > space. > >Which one is better, or there are some other smart ways ? > >I have another question. Since the data I need to save is huge, is it >appropriate that I save the data value in compressed format ? That sounds a lot like premature optimization. Postgres is actually quite good at compacting data natively. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE My Jazz Jukebox: http://www.last.fm/user/leifbk/ ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings