On Tue, Jul 3, 2012 at 10:43 PM, Craig Ringer <ringerc@xxxxxxxxxxxxx> wrote: > On 07/04/2012 07:13 AM, Reza Taheri wrote: > > Following the earlier email introducing the TPC-V benchmark, and that we are > developing an industry standard benchmarking kit for TPC-V using PostgreSQL, > here is a specific performance issue we have run into. > > > Which version of PostgreSQL are you using? > > How has it been tuned beyond the defaults - autovacuum settings, > shared_buffers, effective_cache_size, WAL settings, etc? > > How much RAM is on the blade? What OS and version are on the blade? > > > Comparing the table sizes, we are close to 2X larger (more on this in a > later note). But the index size is what stands out. Our overall index usage > (again, after accounting for different numbers of rows) is 4.8X times > larger. 35% of our I/Os are to the index space. I am guessing that the 4.8X > ballooning has something to do with this, and that in itself explains a lot > about our high I/O rate, as well as higher CPU/tran cycles compared to MS > SQL (we are 2.5-3 times slower). > > This is making me wonder about bloat issues and whether proper vacuuming is > being done. If the visibility map and free space map aren't maintained by > proper vaccum operation everything gets messy, fast. > > Well, MS SQL used a “clustered index” for CT, i.e., the data is held in the > leaf pages of the index B-Tree. The data and index are in one data > structure. Once you lookup the index, you also have the data at zero > additional cost. > > [snip] > > > > Is the PGSQL community willing to invest in a feature that a) has been > requested by many others already; and b) can make a huge difference in a > benchmark that can lend substantial credibility to PGSQL performance? > > > while PostgreSQL doesn't support covering indexes or clustered indexes at > this point, 9.2 has added support for index-only scans, which are a half-way > point of sorts. See: > > http://rhaas.blogspot.com.au/2011/10/index-only-scans-weve-got-em.html > http://rhaas.blogspot.com.au/2010/11/index-only-scans.html > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a2822fb9337a21f98ac4ce850bb4145acf47ca27 > > If at all possible please see how your test is affected by this PostgreSQL > 9.2 enhancement. It should make a big difference, and if it doesn't it's > important to know why. > > (CC'd Robert Haas) > > I'm not sure what the best option for getting a 9.2 beta build for Windows > is. > > > As for the "invest" side - that's really a matter for EnterpriseDB, Command > Prompt, Red Hat, and the other backers who're employing people to work on > the DB. Consider asking on pgsql-hackers, too; if nothing else you'll get a > good explanation of the current state and progress toward clustered indexes. > > Some links that may be useful to you are: > > http://wiki.postgresql.org/wiki/Todo > Things that it'd be good to support/implement at some point. Surprisingly, > covering/clustered indexes aren't on there or at least aren't easily found. > It's certainly a much-desired feature despite its apparent absence from the > TODO. I think there is, deservingly, a lot of hesitation to implement a strictly ordered table construct. A similar feature that didn't quite get finished -- but maybe can be beaten into shape -- is the grouped-index-tuple implementation: http://community.enterprisedb.com/git/ It is mentioned on the TODO page. It's under the category that is perhaps poorly syntactically overloaded in the world "cluster". -- fdr -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance