On Wed, Apr 11, 2012 at 09:15:59AM +0200, Sidney Cadot wrote: > Dear all, > > As a hobby project, I am toying around with a database containing > about 5 million chess games. On average, these games have about 80 > positions (~ 40 moves by both black and white), which means there are > about 400 million chess positions in there. Sounds very interesting! > I am toying around with the ugly idea to make a "Positions" table that > has a single field for each of the squares, e.g. > > CREATE TABLE Position2 ( > GameIndex INTEGER NOT NULL, > PlyIndex INTEGER NOT NULL, > a1 "char" NOT NULL, > a2 "char" NOT NULL, > -- (60 fields defs omitted) > h7 "char" NOT NULL, > h8 "char" NOT NULL > ); > > This would allow the creation of indices on each of the 64 fields > separately, which should help to achieve near-instantaneous position > query performance, especially after gathering proper statistics for > all the field-specific indices. > > I realize that this design is quite ugly, so I would be interested to > hear if there are nicer alternatives that can perform equally well. You could instead create 64 partial indexes on the specific position: CREATE INDEX i_a1 ON Positions (substring(PseudoFenBoard FROM 1 FOR 1)) WHERE substring(PseudoFenBoard FROM 1 FOR 1) != ' '; CREATE INDEX i_a2 ON Positions (substring(PseudoFenBoard FROM 2 FOR 1)) WHERE substring(PseudoFenBoard FROM 2 FOR 1) != ' '; ... so that (a) you don't have to split PseudoFenBoard in more than 1 column, and (b) the indexes are smaller, because they don't index rows having that position empty (I am assuming that you never ask for games where a certain position is empty; if you do, then you need to remove the WHERE clause). > Also, above I use the 1-byte "char" type. Is this the only type in > PostGres that is guaranteed to be just a single byte, or are there > better alternatives? A 13-state enum would be best (listing the 6 > white pieces, 6 black pieces, and 'empty' states for every square on > the board) but as I understand from the documentation, enums always up > take 4 bytes per entry. I think using the 1-byte char is a fairly good choice; you could pack up your structure in a smaller bit string, but then you add complexity elsewhere and it might be desirable to keep things simple for now. Regards, Dr. Gianni Ciolli - 2ndQuadrant Italia PostgreSQL Training, Services and Support gianni.ciolli@xxxxxxxxxxxxxx | www.2ndquadrant.it -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general