incomplete listing of a directory, sometimes getdents loops until out of memory

vbellur at redhat.com (Vijay Bellur) · Fri, 14 Jun 2013 10:35:53 -0400

On 06/13/2013 03:38 PM, John Brunelle wrote:
> Hello,
>
> We're having an issue with our distributed gluster filesystem:
>
> * gluster 3.3.1 servers and clients
> * distributed volume -- 69 bricks (4.6T each) split evenly across 3 nodes
> * xfs backend
> * nfs clients
> * nfs.enable-ino32: On
>
> * servers: CentOS 6.3, 2.6.32-279.14.1.el6.centos.plus.x86_64
> * clients: CentOS 5.7, 2.6.18-274.12.1.el5
>
> We have a directory containing 3,343 subdirectories.  On some clients,
> ls lists only a subset of the directories (a different number on
> different clients).  On others, ls gets stuck in a getdents loop and
> consumes more and more memory until it hits ENOMEM.  On yet others, it
> works fine.  Having the bad clients remount or drop caches makes the
> problem temporarily go away, but eventually it comes back.  The issue
> sounds a lot like bug #838784, but we are using xfs on the backend,
> and this seems like more of a client issue.

Turning on "cluster.readdir-optimize" can help readdir when a directory
contains many sub-directories and the volume has a large number of
bricks. Do you observe any change with this option enabled?
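
One way to compare clients before and after changing the option is to
count the entries each client actually sees and stop as soon as a name
repeats. The following is only a minimal sketch, not a GlusterFS tool:
it assumes Python 3.5+ is available on the client, that a looping
listing shows up as repeated names, and the function name and the
max_entries cap are made up for illustration.

```python
import os
import sys


def summarize_listing(path, max_entries=1000000):
    """Return (unique_count, first_repeated_name_or_None) for a directory.

    Stops at the first repeated name, or once max_entries unique names
    have been seen, so a listing that loops cannot grow without bound.
    max_entries is an arbitrary safety cap (assumption).
    """
    names = set()
    repeated = None
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.name in names:
                # A name seen twice suggests the listing restarted.
                repeated = entry.name
                break
            names.add(entry.name)
            if len(names) >= max_entries:
                break
    return len(names), repeated


if __name__ == "__main__":
    # Usage: python3 this_script.py /path/to/directory
    count, repeated = summarize_listing(sys.argv[1])
    print("unique entries:", count)
    if repeated is not None:
        print("listing repeated at:", repeated)
```

Running this on a good client and a bad client against the same
directory should show whether the bad one returns fewer than the
expected 3,343 entries or starts repeating names.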

-Vijay