Emmanuel Dreyfus <manu@xxxxxxxxxx> wrote:

> Here is the problem: once readdir() has reached the end of the
> directory, on Linux, telldir() will report the last entry's offset,
> while on NetBSD, it will report an invalid offset (it is in fact the
> offset of the next entry beyond the last one, which does not exist).

But that difference did not explain why NetBSD was looping. I
discovered why.

Between each index_fill_readdir() invocation, we have a
closedir()/opendir() invocation. Then index_fill_readdir() calls
seekdir() with an offset obtained from telldir() on the previous,
already closed DIR *. Offsets returned by telldir() are only valid for
the lifetime of the DIR * they were obtained from [1]. Such a rule
makes sense: if the directory content changed, we are likely to return
garbage.

Now if the directory content did not change and we have read
everything, here is what happens:

On Linux, seekdir() works with the offset obtained from the previous
DIR * (it does not have to, according to the standards) and goes to the
last entry. The next read returns EOF and we exit gracefully.

On NetBSD, seekdir() is given an offset from the previous DIR * that
points beyond the last entry. It fails and is a no-op. The subsequent
readdir_r() operates from the beginning of the directory, and we never
get EOF. Here is our infinite loop.

The correct fix is either:

1) keep the directory open between index_fill_readdir() invocations,
   but since that means preserving an open directory across different
   syncops, I am not sure it is a good idea.

2) do not reuse the offset from the last attempt. That means if the
   buffer gets filled, grow it and retry from the start, until the data
   fits. This is bad performance-wise, but it seems the only safe way
   to me.

Opinions?

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@xxxxxxxxxx
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
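
For illustration, option 2 above could look roughly like the sketch
below. It is not GlusterFS code: fill_once() and read_dir_names() are
hypothetical names, the buffer layout (consecutive NUL-terminated entry
names) is an assumption, and it uses readdir() rather than readdir_r()
to keep things short. The point is only that each attempt reads the
directory from the start on a fresh DIR *, so no telldir() offset ever
outlives the DIR * it came from.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { FILL_OK = 0, FILL_TOO_SMALL = 1, FILL_ERROR = -1 };

/*
 * Hypothetical helper: one complete pass over `path` on a fresh DIR *.
 * Entry names are copied into buf[0..cap) as consecutive NUL-terminated
 * strings. No telldir()/seekdir() is involved.
 */
static int fill_once(const char *path, char *buf, size_t cap, size_t *used)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return FILL_ERROR;

    int ret = FILL_OK;
    size_t off = 0;

    for (;;) {
        errno = 0;                      /* tell EOF apart from an error */
        struct dirent *ent = readdir(dir);
        if (ent == NULL) {
            if (errno != 0)
                ret = FILL_ERROR;
            break;                      /* real EOF or error */
        }
        size_t len = strlen(ent->d_name) + 1;
        if (len > cap - off) {          /* does not fit: caller must grow */
            ret = FILL_TOO_SMALL;
            break;
        }
        memcpy(buf + off, ent->d_name, len);
        off += len;
    }

    closedir(dir);
    *used = off;
    return ret;
}

/*
 * Hypothetical wrapper: grow the buffer and retry from scratch until the
 * whole directory fits. Returns a malloc()ed buffer (caller frees) holding
 * *used bytes of names, or NULL on error.
 */
char *read_dir_names(const char *path, size_t *used)
{
    assert(path != NULL && used != NULL);

    size_t cap = 64;                    /* arbitrary starting size */
    char *buf = NULL;

    for (;;) {
        char *tmp = realloc(buf, cap);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;

        int ret = fill_once(path, buf, cap, used);
        if (ret == FILL_OK)
            return buf;
        if (ret == FILL_ERROR || cap > SIZE_MAX / 2) {
            free(buf);
            return NULL;
        }
        cap *= 2;                       /* too small: double and start over */
    }
}
```

Each retry rereads everything already seen, which is the performance
cost mentioned above. In exchange, the result does not depend on how a
given platform treats a stale offset passed to seekdir().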