Re: mkdir produces stale file handles

Thanks for the help!

> > Thanks for the quick answer!
> >
> > I think I can reduce data on the "full" bricks, solving the problem temporarily.
> >
> > The thing is, that the behavior changed from 3.12 to 6.5:   3.12 didn't have problems with almost full bricks, so I thought everything was fine. Then, after the upgrade, I ran into this problem. This might be a corner case that will go away once no-one uses 3.12 any more.
> >
> > But I think I can create a situation with 6.5 only that reproduces the error. Suppose I have a brick that is 99% full, so a write() will succeed. After the write, the brick can be 100% full, so a subsequent mkdir() will produce stale file handles (i.e., bricks that have different directory trees). The funny thing is that the mkdir() on the user side does not produce an error. Clearly, no-one should ever let the file system get to 99%, but mkdir should still fail then...
> 
> I think there is a soft and hard limit that prevents creation of files/folders when a specific threshold is hit, but that threshold might be per brick instead of per replica set.

There is cluster.min-free-disk, which says that the server should look for another brick if the hash would place the file on a brick with less than "min-free-disk" free space. However, this seems to be a "should": if all bricks have less free space than "min-free-disk", the file is written anyway.
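
(For reference, the option can be set and inspected per volume like this; "testvol" is just a placeholder for the volume name, and the value can be given as a percentage or an absolute size:)

  gluster volume set testvol cluster.min-free-disk 10%
  gluster volume get testvol cluster.min-free-disk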

Apart from that, I have some really large bricks (around 200 TB each), which means that if these are 99% full, there are still 2 TB left (a significant amount). The logic of "do not create a directory if the brick is 100% full" seems to be hard-coded; I didn't find a setting to disable it.
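
(To see how full the individual bricks are, something like the following should work; "testvol" and the brick path are placeholders:)

  gluster volume status testvol detail   # shows total and free disk space per brick
  df -h /path/to/brick                   # or directly on each brick server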

Nonetheless, I think I can construct a test case where a sequence of write() and mkdir() creates stale file handles, even though all userland operations succeed. Should I consider this a bug and make the effort to construct a test case? (Not on my production system, but on a toy model; it will take me a few days...)
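
(Roughly, the test case I have in mind would look like the sketch below; the mount point and names are made up, the idea is just to fill one brick via write() from the client and then run mkdir() through the mount:)

  # volume mounted on the client at /mnt/testvol
  dd if=/dev/zero of=/mnt/testvol/filler bs=1M   # fill the brick the hash picks until ENOSPC
  mkdir /mnt/testvol/newdir                      # returns success on the client
  ls /mnt/testvol/newdir                         # may hit ESTALE if the directory is missing on the full brick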

> > What remains: is there a recommended way to deal with the situation that some bricks don't have all directories?
> 
> I think that you can mount the gluster volume and run a find with stat that will force a sync.
> find /rhev/mnt/full-path/directory-missing-on-some-bricks -iname '*' -exec stat {} \;

Thanks a lot! That indeed fixed the missing directories! (I didn't know that a stat triggers a sync of the bricks.)
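
(For the archives: as far as I understand, on replicated volumes a full self-heal can also be triggered from the server side instead of walking the mount with find/stat; "testvol" is a placeholder for the volume name:)

  gluster volume heal testvol full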

best wishes,
Stefan



