DHT, pre-existing data unevenly distributed

I have been using DHT to join together two large filesystems (2.5TB)
containing pre-existing data.  I solved the problem of ls not seeing
all the files by running "rsync --dry-run" from the individual brick
directories to the glusterfs mounted volume.  I am using
glusterfs-3.0.2 with "option lookup-unhashed yes" for DHT on the
client.  All seemed well until the volume started to get nearly full;
despite also using "option min-free-disk 10%", one of the bricks
became 100% full, preventing any further writes to the whole volume.
I managed to get going again by manually transferring some data from
one server to the other, making the two more evenly balanced, but I
would like to find a more permanent solution.  I would also like to
know whether this sort of thing is supposed to happen with DHT and
pre-existing data when the data is not evenly distributed across the
bricks.  I have included my client volume file at the bottom of this
message.
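
For reference, the rsync trick was roughly as follows (paths are
illustrative; /export/brick1 stands for a brick's export directory on
one of the servers and /mnt/glusterfs for the client mount point):

    # Dry run: nothing is copied, but rsync stats every file through
    # the mount, which triggers the DHT lookups that make the
    # pre-existing files visible.
    rsync -a --dry-run /export/brick1/ /mnt/glusterfs/

The manual rebalancing was along the same lines (again illustrative;
run on the full server, then stat the moved files through the mount
so that DHT finds them on their new brick):

    rsync -a /export/brick1/somedir/ perseus:/export/brick1/somedir/
    rm -rf /export/brick1/somedir
    ls -lR /mnt/glusterfs/somedir > /dev/null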

I tried using the unify translator instead, even though it is
supposedly now obsolete, but glusterfs crashed (segfault) when I
tried to mount the volume.  I thought perhaps unify was no longer
supported in 3.0.2, so I didn't pursue that option any further.
However, if unify turns out to be better than DHT for pre-existing
data, I will have to find out what went wrong.  Should I be using the
unify translator instead of DHT for pre-existing data that is
unevenly distributed across bricks?  If I can continue with DHT, can
I stop using "option lookup-unhashed yes" at some point?

Regards,
Dan Bretherton
####
## Client vol file
volume romulus
    type protocol/client
    option transport-type tcp
    option remote-host romulus
    option remote-port 6996
    option remote-subvolume brick1
end-volume

volume perseus
    type protocol/client
    option transport-type tcp
    option remote-host perseus
    option remote-port 6996
    option remote-subvolume brick1
end-volume

volume distribute
    type cluster/distribute
    option min-free-disk 10%
    option lookup-unhashed yes
    subvolumes romulus perseus
end-volume

volume io-threads
    type performance/io-threads
    #option thread-count 8  # default is 16
    subvolumes distribute
end-volume

volume io-cache
    type performance/io-cache
    option cache-size 1GB
    subvolumes io-threads
end-volume

volume main
    type performance/stat-prefetch
    subvolumes io-cache
end-volume


-- 
Mr. D.A. Bretherton
Reading e-Science Centre
Environmental Systems Science Centre
Harry Pitt Building
3 Earley Gate
University of Reading
Reading, RG6 6AL
UK

Tel. +44 118 378 7722
Fax: +44 118 378 6413

