Stale File Handle Errors During Heavy Writes

Timothy Orme <torme@xxxxxxxxxxxx> · Wed, 27 Nov 2019 01:38:12 +0000

Hi All,

I'm running a 3x2 cluster on v6.5.  Not sure if it's relevant, but I also have sharding enabled.

I've found that under heavy write load, clients start erroring out with "stale file handle" errors on files that are not related to the writes.

For instance, when a user runs a simple wc against a file, it bails out partway through with "stale file handle".

When I check the client logs, I see errors like:

[2019-11-26 22:41:33.565776] E [MSGID: 109040] [dht-helper.c:1336:dht_migration_complete_check_task] 3-scratch-dht: 24d53a0e-c28d-41e0-9dbc-a75e823a3c7d: failed to lookup the file on scratch-dht [Stale file handle]

[2019-11-26 22:41:33.565853] W [fuse-bridge.c:2827:fuse_readv_cbk] 0-glusterfs-fuse: 33112038: READ => -1 gfid=147040e2-a6b8-4f54-8490-f0f3df29ee50 fd=0x7f95d8d0b3f8 (Stale file handle)
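
For anyone trying to see which files are affected, here is a minimal sketch that tallies "Stale file handle" entries per GFID from a client log. The GFID pattern and the idea that the error text may wrap onto the line after the GFID are assumptions based only on the two entries above, not on any documented log format, and count_stale_gfids is a hypothetical helper name.

```python
import re
from collections import Counter

# UUID-shaped GFID, inferred from the sample entries above (an assumption,
# not a documented Gluster log format).
GFID_RE = re.compile(r"\b[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\b")


def count_stale_gfids(lines):
    """Count 'Stale file handle' log entries per GFID (hypothetical helper).

    If the error text sits on its own line, it is attributed to the GFID
    found on the immediately preceding line.
    """
    counts = Counter()
    previous_gfid = None
    for line in lines:
        match = GFID_RE.search(line)
        gfid = match.group(0) if match else previous_gfid
        if "stale file handle" in line.lower() and gfid:
            counts[gfid] += 1
        # Only carry a GFID forward by one line, to cover wrapped entries.
        previous_gfid = match.group(0) if match else None
    return counts


# Sample input copied from the log excerpts in this message.
sample = [
    "[2019-11-26 22:41:33.565776] E [MSGID: 109040] "
    "[dht-helper.c:1336:dht_migration_complete_check_task] 3-scratch-dht: "
    "24d53a0e-c28d-41e0-9dbc-a75e823a3c7d: failed to lookup the file on scratch-dht",
    "[Stale file handle]",
    "",
    "[2019-11-26 22:41:33.565853] W [fuse-bridge.c:2827:fuse_readv_cbk] "
    "0-glusterfs-fuse: 33112038: READ => -1 "
    "gfid=147040e2-a6b8-4f54-8490-f0f3df29ee50 fd=0x7f95d8d0b3f8 (Stale file handle)",
]

for gfid, n in count_stale_gfids(sample).items():
    print(gfid, n)
```

To run it against a real log, pass an open file object instead of the sample list; the log path depends on the mount and is not shown here.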

I've seen some bugs or other threads referencing similar issues, but couldn't really discern a solution from them.

Is this caused by some consistency issue with metadata while under load, or something else?  I don't see the issue when heavy reads are occurring.

Any help is greatly appreciated!

Thanks!

Tim
