I faced the same issue with the sharding translator. I fixed it by making its readdirp callback initialize each entry's inode ctx from the values (some of them xattrs) that the posix translator fills into entry->dict.
Here is the patch that got merged recently: http://review.gluster.org/11854
Would that be as easy to do in DHT as well?
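For anyone who wants the shape of the shard-style fix without opening the patch, here is a minimal standalone sketch. It assumes simplified stand-in types: shard_sketch_dict, shard_sketch_inode, shard_sketch_entry and the key "sketch.block-size" are all hypothetical and are not the real GlusterFS structures or xattr names. It only shows the idea of seeding a not-yet-initialised inode ctx from per-entry data while walking the readdirp reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT the real GlusterFS types. */
struct shard_sketch_dict {          /* plays the role of entry->dict */
    const char *key;
    uint64_t    value;
    int         present;            /* 1 if the brick returned the xattr */
};

struct shard_sketch_inode {
    int      ctx_valid;             /* has this translator's ctx been set? */
    uint64_t block_size;            /* value cached in the ctx */
};

struct shard_sketch_entry {
    const char                *name;
    struct shard_sketch_inode *inode;
    struct shard_sketch_dict   dict;
    struct shard_sketch_entry *next;
};

/* Hypothetical xattr key, chosen only for this sketch. */
#define SHARD_SKETCH_KEY "sketch.block-size"

/*
 * Walk the entries of a readdirp reply and initialise the inode ctx of
 * every entry whose ctx is not set yet, using the xattr value carried in
 * the entry's dict. Returns the number of ctxs initialised.
 */
int shard_sketch_readdirp_cbk(struct shard_sketch_entry *entries)
{
    int initialised = 0;

    for (struct shard_sketch_entry *e = entries; e != NULL; e = e->next) {
        /* No inode linked, or ctx already set by an earlier lookup. */
        if (e->inode == NULL || e->inode->ctx_valid)
            continue;

        /* The xattr was not returned for this entry: leave the ctx unset. */
        if (!e->dict.present || e->dict.key == NULL ||
            strcmp(e->dict.key, SHARD_SKETCH_KEY) != 0)
            continue;

        e->inode->block_size = e->dict.value;
        e->inode->ctx_valid  = 1;
        initialised++;
    }

    return initialised;
}
```

Entries that already have a ctx, or that came back without the xattr, are left alone in this sketch, so a later fop on them would still need a lookup.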
As far as AFR is concerned, it indirectly forces LOOKUP on entries which are being retrieved for the first time through a READDIRP (and as a result do not have their inode ctx etc initialised yet) by setting entry->inode to NULL. See afr_readdir_transform_entries().
This is the default behavior which is being made optional as part of http://review.gluster.org/#/c/11846/ which is still under review (see BZ 1250803, a performance bug :) ).
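The AFR approach can be sketched in the same simplified way. This is not the code of afr_readdir_transform_entries(); the types below and the force_lookup flag are hypothetical stand-ins, the flag standing for the "optional" behavior under review. The real code also has to manage the inode reference before dropping the pointer, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real GlusterFS types. */
struct afr_sketch_inode {
    int ctx_initialised;            /* set once a LOOKUP has run on the inode */
};

struct afr_sketch_entry {
    const char              *name;
    struct afr_sketch_inode *inode;
    struct afr_sketch_entry *next;
};

/*
 * For every entry whose inode has never been looked up, drop the inode
 * pointer from the readdirp reply. The entry is then returned as a plain
 * name, so the caller has to send an explicit LOOKUP before using it.
 * Returns the number of entries that were cleared.
 */
int afr_sketch_transform_entries(struct afr_sketch_entry *entries,
                                 int force_lookup)
{
    int cleared = 0;

    /* Hypothetical switch: with it off, entries pass through untouched. */
    if (!force_lookup)
        return 0;

    for (struct afr_sketch_entry *e = entries; e != NULL; e = e->next) {
        if (e->inode != NULL && !e->inode->ctx_initialised) {
            e->inode = NULL;        /* real code must also drop its reference */
            cleared++;
        }
    }

    return cleared;
}
```

The cost is the one the thread is about: every entry cleared this way pays for one extra LOOKUP the first time it is used.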
-Krutika
From: "Mohammed Rafi K C" <rkavunga@xxxxxxxxxx>
To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
Cc: "Dan Lambright" <dlambrig@xxxxxxxxxx>, "Nithya Balachandran" <nbalacha@xxxxxxxxxx>, "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>, "Ben Turner" <bturner@xxxxxxxxxx>, "Ben England" <bengland@xxxxxxxxxx>, "Manoj Pillai" <mpillai@xxxxxxxxxx>, "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>, "Ravishankar Narayanankutty" <ranaraya@xxxxxxxxxx>, kdhananj@xxxxxxxxxx, xhernandez@xxxxxxxxxx
Sent: Wednesday, August 12, 2015 7:29:48 PM
Subject: Inconsistent behavior due to lack of lookup on entry followed by readdirp

Hi All,

We are facing some inconsistent behavior for fops like rename, unlink,
etc. due to the lack of a lookup after a readdirp. More specifically, if
inodes/gfids are populated via a readdirp call and the nodeid is shared
with the kernel, md-cache will cache the entry based on its basename. A
subsequent named lookup will then be served from md-cache, which unwinds
immediately. So there is a chance of a fop being triggered without a
lookup ever having happened on the entry.

DHT does a lot of things during lookup, like creating link files and
populating inode_ctx. In such a scenario, at least one lookup must
happen on an entry. Since readdirp prevents that lookup, it is very hard
for fops to proceed without a first lookup on the entry. We also suspect
some problems with afr/ec self-healing for the same reason.

If we remove readdirp from md-cache ([1], [2]), it causes an additional
hop for the first lookup on every entry. I'm mostly concerned about this
one extra network call and the performance degradation it causes.

With this change, the only remaining advantage of readdirp is that it
removes one context switch between kernel and userspace. Is it really
worth sacrificing this for consistency?

What do you think about removing readdirp functionality?

Please provide your input/suggestions/ideas.

[1] : http://review.gluster.org/#/c/11892/
[2] : http://review.gluster.org/#/c/11894/

Thanks in advance,
Rafi KC