> On 2010-12-21 10:42, Massa, Harald Armin wrote: >> b) creating an index requires to read the data-to-be-indexed. So, to >> have an >> index pointing at the interesting rows for your query, the table has to >> be >> read ... which would be the perfect time to allready select the >> interesting >> rows. And after having the interesting rows: the index is worthless > > ... until another similar query comes along, when suddenly it's a massive > win. > Why not auto-create indices for some limited period after database load > (copy? any large number of inserts from a single connection?), track > those > that actually get re-used and remove the rest? Would this not provide > a better out-of-the-box experience for neophytes? Say you have a table with several columns (A,B,C), and the query is using some of them. What indexes would you create? One index on every column? A multi-column index on all columns? Indexs for each combination of columns? There really is no automatic way to solve this puzzle using a single query. Indexing strategy is a very tough design discipline, and it requires a complex knowledge of the workload. One slow query does not mean the index should be created - what if that was just an ad-hoc query and will not be executed ever again? So although this 'automatic index creation' seems nice, it really does not work in practice. I really don't want to work with products that try to be smarter than me (and I admit I'm pretty dumb from time to time) and act rather randomly due to this 'artificial intelligence'. I've already been there and I don't want to repeat this experience. > > [...] >> Why is the query planner not allowed to create indexes, but only allowed >> to >>> use or not use what's available? >>> >> >> as in b): Creating an index is quite expensiv > > How much more so than doing that full-table-scan plus sort, which your > query is doing anyway? A lot. It consumes a lot of CPU, it may consume a lot of memory (up to maintenance_work_mem) etc. So imagine a few users, building indices on a big table simultaneously - no, I really don't want to see this on a production server. Building an index is just one side of the problem - maintenance of the indexes is another thing. Each index has an impact on write operations (INSERT/UPDATE) and may cause that HOT actually does not work (thus causing unnecessary bloat of the index). And the indexes may need lot of space on a disk. But the real show stopper is probably locking. Building an index takes a write lock on a table, effectively blocking writes. Sure, you can use a 'CREATE INDEX ... CONCURRENTLY' introduced in 8.2, but there are some disadvantages of that (see http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY). cheers Tomas -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general