----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> To: "Joe Julian" <joe@xxxxxxxxxxxxxxxx>
> Cc: "Gluster-devel@xxxxxxxxxxx" <gluster-devel@xxxxxxxxxxx>, pkoro@xxxxxxxxxxxx
> Sent: Thursday, January 22, 2015 12:58:47 PM
> Subject: Re: Quota problems without a way of fixing them
>
> ----- Original Message -----
> > From: "Joe Julian" <joe@xxxxxxxxxxxxxxxx>
> > To: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> > Cc: pkoro@xxxxxxxxxxxx, "Gluster-devel@xxxxxxxxxxx" <gluster-devel@xxxxxxxxxxx>
> > Sent: Thursday, January 22, 2015 11:16:39 AM
> > Subject: Re: Quota problems without a way of fixing them
> >
> > On 01/21/2015 09:32 PM, Raghavendra Gowdappa wrote:
> > >
> > > ----- Original Message -----
> > >> From: "Joe Julian" <joe@xxxxxxxxxxxxxxxx>
> > >> To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > >> Cc: "Paschalis Korosoglou" <pkoro@xxxxxxxxxxxx>
> > >> Sent: Thursday, January 22, 2015 12:54:44 AM
> > >> Subject: Quota problems without a way of fixing them
> > >>
> > >> Paschalis (PeterA in #gluster) has reported these bugs, and we have
> > >> tried to find the source of the problem to no avail. Worse yet, as far
> > >> as I can tell there is no way to simply reset the quotas to match what
> > >> is actually there.
> > >>
> > >> What should we look for to isolate the source of this problem? This is
> > >> a production system with enough activity to make isolating a
> > >> reproducer difficult at best, and the debug logs have enough noise to
> > >> make isolation nearly impossible.
> > >>
> > >> Finally, isn't there some simple way to trigger quota to rescan a path
> > >> and reset trusted.glusterfs.quota.size?
> > >
> > > 1. Delete the following xattrs from all files/directories on all the
> > >    bricks:
> > >    a) trusted.glusterfs.quota.size
> > >    b) trusted.glusterfs.quota.*.contri
> > >    c) trusted.glusterfs.quota.dirty
> > >
> > > 2. Turn off md-cache:
> > >    # gluster volume set <volname> performance.stat-prefetch off
> > >
> > > 3. Mount glusterfs, asking it to use readdir instead of readdirp:
> > >    # mount -t glusterfs -o use-readdirp=no <volfile-server>:<volfile-id> /mnt/glusterfs
> > >
> > > 4. Do a crawl on the mountpoint:
> > >    # find /mnt/glusterfs -exec stat \{} \; > /dev/null
> > >
> > > This should correct the accounting on the bricks. Once done, you should
> > > see correct values in the quota list output. Please let us know if it
> > > doesn't work for you.
> >
> > But that could be a months-long process given the size of many of our
> > users' volumes. There should be a way to do this for a single directory
> > tree.
>
> If you can isolate a sub-directory tree where the size accounting has gone
> bad,

But the problem with this approach is: how do we know whether the parents of
this sub-directory have correct sizes? If a sub-directory has a wrong size,
then most likely the accounting of all the ancestors of that sub-directory,
up to the root, has gone bad as well. Hence I am skeptical about healing just
"part" of a directory tree.

> this can be done by setting the xattr trusted.glusterfs.quota.dirty of a
> directory to 1 and sending a lookup on that directory. Basically, what this
> does is add up the sizes of all immediate children and set the sum as the
> value of trusted.glusterfs.quota.size on the directory. But the catch here
> is that the sizes of the immediate children need not themselves be
> accounted correctly. Hence this healing should be done bottom up, starting
> with the bottom-most directories and working towards the top-level
> directory of the isolated subtree.
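For a single directory whose immediate children are already known to be
correct, the dirty-xattr trigger described above can be exercised by hand.
A minimal sketch, with the caveats that the single-byte 0x01 encoding of the
dirty flag is an assumption, and that the XATTR_NS variable exists only so
the sketch can be tried without root against the "user" namespace on a local
filesystem (a real heal uses "trusted", as in the mail):

```shell
# Mark one directory dirty and look it up, so that quota recomputes its
# trusted.glusterfs.quota.size from its immediate children.
# XATTR_NS defaults to "trusted" (the real namespace); set it to "user"
# to try the sketch unprivileged on an ordinary filesystem.
XATTR_NS=${XATTR_NS:-trusted}

quota_heal_dir() {
    dir=$1
    # Write the dirty flag as a single 0x01 byte (encoding assumed).
    setfattr -n "${XATTR_NS}.glusterfs.quota.dirty" -v 0x01 "$dir"
    # The lookup triggered by stat performs the actual heal.
    stat "$dir" > /dev/null
}
```

Run against the mountpoint, e.g. `quota_heal_dir /mnt/glusterfs/some/dir`
(path hypothetical), and repeat bottom-up to cover a whole subtree.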
> We can have an algorithm like this:
>
> void
> heal (char *path)
> {
>         char value = 1;
>         struct stat stbuf = {0, };
>
>         setxattr (path, "trusted.glusterfs.quota.dirty",
>                   (const void *) &value, sizeof (value), 0);
>
>         /* now that the dirty xattr has been set, trigger a lookup so
>          * that the directory is healed */
>         stat (path, &stbuf);
>
>         return;
> }
>
> void
> crawl (DIR *dirp, char *path)
> {
>         struct dirent *entry = NULL;
>
>         while ((entry = readdir (dirp)) != NULL) {
>                 if (entry->d_type == DT_DIR) {
>                         DIR *childdir = NULL;
>                         char *childpath = NULL;
>
>                         if ((strcmp (entry->d_name, ".") == 0) ||
>                             (strcmp (entry->d_name, "..") == 0))
>                                 continue;
>
>                         /* construct_path joins path and d_name */
>                         childpath = construct_path (path, entry->d_name);
>
>                         childdir = opendir (childpath);
>
>                         crawl (childdir, childpath);
>
>                         closedir (childdir);
>                 }
>         }
>
>         /* all children are healed now; heal this directory itself */
>         heal (path);
>
>         return;
> }
>
> Now call crawl on the isolated sub-directory (on the mountpoint). Note that
> the above is pseudo-code, and a tool should be written using this
> algorithm. We'll try to add a program to extras/utils which does this.
>
> > >
> > >> His production system has been unmanageable for months now. Is it
> > >> possible for someone to spare some cycles to get this looked at?
> > >>
> > >> 2013-03-04 - https://bugzilla.redhat.com/show_bug.cgi?id=917901
> > >> 2013-10-24 - https://bugzilla.redhat.com/show_bug.cgi?id=1023134
> > >
> > > We are working on these bugs. We'll update the bugzillas once we find
> > > anything substantial.

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel