In fact, read the whole thing, transfer it to the client and let the client handle the posix syntax.
Optionally add a path cache timeout client side that stores the directory listing for a period of time to mitigate the "php" dilemma for those types of use cases.
On September 13, 2014 12:02:55 PM PDT, manu@xxxxxxxxxx wrote:
In <1lrx1si.n8tms1igmi5pM%manu@xxxxxxxxxx> I explained why NetBSD
currently fails self-heald.t, but since the subjet is burried deep in a
thread, it might be worth starting a new one to talk about how to fix.
In 3 places within glusterfs code (features/index,
features/snapview-server and storage/posix), a server component answers
readdir requests on a directory which may be split in mulitple calls.
To answer one call, we have the following library calls:
- opendir()
- seekdir() to resume where the previous request was
- readdir()
- telldir() to record where we are for the next request
- closedir()
This relies on unspecified behavior, as POSIX says: "The value of loc
should have been returned from an earlier call to telldir() using the
same directory stream."
http://pubs.opengroup.org/onlinepubs/9699919799/functions/seekdir.html
Since we do opendir() and closedir() at each time, we do not use the
same directory stream. It causes an infinite loop on NetBSD because it
badly resume from previous request, and in the general case it will
break badly if an entry is added in the directory between two requests.
How can we fix that?
1) we can keep the directory stream open. The change is intrusive since
we will need a chained list of open contexts, and we need to clean them
if they timeout.
2) in order to keep state between requests, we can use the entry index
(first encoutered is 1, and so on) instead of values returned by
telldir(). That works around the unspecified behavior, but it still
breaks if directory content is changed between two requests
3) make sure the readdir is done in a single request. That means trying
with bigger buffers until it works. For instance in
xlator/cluster/afr/src/afr-self-heald.c we have:
while ((ret = syncop_readdir (subvol, fd, 131072, offset, &entries)))
We would use -1 instead of 131072 to tell that we want everything
without a size limit, and the server component (here features/index)
would either return everyting or fail, whithout playing with
telldir/seekdir.
Opinions? The third solution seems the best to me since it is not very
intrusive and it makes things simplier. Indeed we allow unbound data
size to come back from the brick to glustershd, but we trust the brick,
right?
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
_______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx http://supercolony.gluster.org/mailman/listinfo/gluster-devel