Finally, I succeeded in compiling the latest patched version of glusterfs.
I configured and compiled the latest patched sources fetched with tla;
though I had a problem with an older automake on my servers, the same
archive autogen.sh-ed and configured just fine on the client computer
running CentOS 5.0. So I picked up the whole generated tree, ran
./configure on the final destination server, and everything went OK.
I started the three servers, mounted the client locally, and tried the
PostgreSQL database on the mounted disk again.
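For reference, each server exports one brick through the usual
storage/posix + protocol/server pair; the paths and auth rules below are
only placeholders (and I may have extra translators on the bricks), but
the layout is along these lines:

volume brick
type storage/posix
option directory /data/export      # placeholder path
end-volume

volume server
type protocol/server
option transport-type tcp/server
option auth.ip.brick.allow *       # placeholder auth rule
subvolumes brick
end-volume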
No more "Transport endpoint is not connected" errors .. BUT the database
cannot complete a simple import into a table complaining about some
"unexpected data beyond EOF in a file block"
COPY animal (id, stare_animal_fk, rasa_fk, sex, data_inregistrare,
data_nasterii, prima_exploatatie_fk, primul_proprietar_fk, cod_anarz,
data_trecere_granita, tara_origine_fk, cod_crotalie_non_eu,
crotalie_mama, observatii, data_upload, versiune) FROM stdin;
ERROR: unexpected data beyond EOF in block 2480 of relation "animal"
HINT: This has been seen to occur with buggy kernels; consider updating
your system.
I tried 5 times and got exactly the same error! I suspect some data
corruption when the data file blocks are assembled.
The client volume is configured with the AFR, READAHEAD and WRITEBEHIND
translators like this (the protocol/client subvolumes it references are
sketched right after the spec):
volume afr
type cluster/afr
subvolumes client1 client2 client3
option replicate *:3
end-volume
volume writebehind
type performance/write-behind
option aggregate-size 131072 # aggregate block size in bytes
subvolumes afr
end-volume
volume readahead
type performance/read-ahead
option page-size 131072 ### size in bytes
option page-count 16 ### page-size x page-count is the amount of read-ahead data per file
subvolumes writebehind
end-volume
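The client1, client2 and client3 subvolumes are plain protocol/client
translators, one per server; the host name below is a placeholder, the
rest is the standard definition:

volume client1
type protocol/client
option transport-type tcp/client
option remote-host server1         # placeholder host name
option remote-subvolume brick
end-volume
# client2 and client3 are identical except for remote-host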
I suspected that readahead & writebehind might have some problems, so I
commented them out, leaving the afr volume alone, non-optimized.
The operations were obviously slower, but everything went OK; I tried
multiple reads and updates and everything is OK now.
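So for that test the client graph was reduced to the bare afr volume on
top of the three protocol/client subvolumes, nothing else:

volume afr
type cluster/afr
subvolumes client1 client2 client3
option replicate *:3
end-volume
# no performance translators; afr is the top (mounted) volume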
Then I tried to test just the writebehind translator in order to
pinpoint the buggy code (the spec used for this test is sketched below,
after the psql output).
With only the writebehind translator reintroduced, everything seemed to
work: I imported the table and did 9 full updates on over 700,000
records, but when I tried to vacuum the table ... dang, another error:
glu=# update animal set observatii='ok1';
UPDATE 713268
glu=# update animal set observatii='ok2';
UPDATE 713268
.............
glu=# update animal set observatii='ok8';
UPDATE 713268
glu=# update animal set observatii='ok9';
UPDATE 713268
glu=# vacuum full analyze;
ERROR: could not read block 69998 of relation
531069804/531069805/531069806: File descriptor in bad state
glu=# vacuum full analyze;
ERROR: could not read block 69998 of relation
531069804/531069805/531069806: File descriptor in bad state
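To be clear about what this test ran, the spec had only the write-behind
performance volume stacked directly on afr, with readahead removed, so
write-behind was the top (mounted) volume:

volume writebehind
type performance/write-behind
option aggregate-size 131072       # same 128 KB aggregate size as above
subvolumes afr                     # sits directly on the afr volume
end-volume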
I dropped the database and the files, cleaned everything, and started
over again with fresh, empty volumes: created the database again,
imported the table (OK), updated once, then vacuum -> ERROR:
glu=# update animal set observatii='ok1';
UPDATE 713268
glu=# vacuum full analyze;
ERROR: could not read block 478 of relation
531783093/531783094/531783095: File descriptor in bad state
I removed the writebehind translator, activated readahead, and ran all
the same tests over and over again -> EVERYTHING IS OK !
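The readahead-only spec that works fine is the same stack with
read-ahead sitting directly on afr instead of on writebehind:

volume readahead
type performance/read-ahead
option page-size 131072
option page-count 16
subvolumes afr                     # directly on afr, writebehind removed
end-volume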
So, the write-behind translator should be revised.
How can I help you pinpoint the bug?
--
Constantin Teodorescu
Braila