"Simon Riggs" <simon@xxxxxxxxxxxxxxx> writes:
> On Fri, 2006-11-10 at 18:55 -0500, Tom Lane wrote:
>> One thought that comes to mind is to require hash to do an smgrextend()
>> addressing the last block it intends to use whenever it allocates a new
>> batch of blocks, whereupon md.c could adopt a saner API: allow
>> smgrextend but not other calls to address blocks beyond the current EOF.

> Yes, do it.

I found out that it's easy to reproduce this failure in the regression
tests, just by building with RELSEG_SIZE set to 128K instead of 1G:

*** ./expected/create_index.out	Sun Sep 10 13:44:25 2006
--- ./results/create_index.out	Thu Nov 16 17:33:29 2006
***************
*** 323,328 ****
--- 323,329 ----
  --
  CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
  CREATE INDEX hash_name_index ON hash_name_heap USING hash (random name_ops);
+ ERROR:  could not open segment 7 of relation 1663/16384/26989 (target block 145): No such file or directory
  CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
  CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
  --
  CREATE INDEX hash_ovfl_index ON hash_ovfl_heap USING hash (x int4_ops);

AFAICS, any hash index exceeding a single segment is at serious risk.
The fact that we've not heard gripes before suggests that no one is
using gigabyte-sized hash indexes.

But it seems mighty late in the beta cycle to be making subtle changes
in the smgr API.  What I'm inclined to do for now is to hack
_hash_expandtable() to write a page of zeroes at the end of each file
segment whenever an increment in hashm_ovflpoint causes the logical EOF
to cross one or more segment boundaries.  This is pretty ugly and
nonmodular, but it will fix the bug without risking breakage of any
non-hash code.  I'll revisit the cleaner solution once 8.3 devel begins.

Comments?

			regards, tom lane