Re: Upgraded from 3.4.1 to 3.5.2, quota no longer working

From: "David Gibbons" <david.c.gibbons@xxxxxxxxx>
To: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
Sent: Tuesday, December 2, 2014 6:54:59 PM
Subject: Re: Upgraded from 3.4.1 to 3.5.2, quota no longer working

Thank you for the assistance.

Yesterday, bricks on one server started crashing at random. When they crashed, the bricks on its replica locked up as well. I ended up upgrading to 3.5.3, and noticed in the process that the libgfrpc and libgfxdr libraries were out of date on the server with the crashing bricks. Upgrading to 3.5.3 and replacing the old versions of those libraries on the cranky server seems to have made everything happy again.
Yes, one brick crash and a couple of other quota issues that were reported in 3.5.2 were fixed in 3.5.3. Good to hear that things are working fine now. :)
-Krutika
Thanks again!
Dave

On Tue, Dec 2, 2014 at 2:28 AM, Krutika Dhananjay <kdhananj@xxxxxxxxxx> wrote:
Hi,

Are you sure the post-upgrade script ran to completion?
Here is one way to confirm whether that is the case: check if the quota configured directories have an xattr called "trusted.glusterfs.quota.limit-set" set on them in the respective bricks.

For example, here's what mine looks like:

[root@haddock 1]# pwd
/brick/1
[root@haddock 1]# getfattr -d -m . -e hex 1
# file: 1
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.gfid=0x57d0a561ca574d1cb0428f38d1c06e85
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x0000000000000a00
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.limit-set=0x0000000006400000ffffffffffffffff
trusted.glusterfs.quota.size=0x0000000000000a00


where /brick/1 is the brick directory and under it "1" is the name of one of the quota-configured directories.
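To read the hex values in the getfattr output above, a small decoder can help. This is only a sketch: it assumes that "trusted.glusterfs.quota.limit-set" holds two big-endian signed 64-bit integers (hard limit in bytes, then soft limit, with -1 meaning "not set explicitly") and that "trusted.glusterfs.quota.size" holds one such integer. That layout is my reading of the 3.5.x values shown here, not something confirmed in this thread, and the function names are made up for illustration.

```python
import struct


def _to_bytes(hex_value):
    """Turn a getfattr-style hex string (with or without 0x) into bytes."""
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    return bytes.fromhex(text)


def decode_limit_set(hex_value):
    """Decode a limit-set value into (hard_limit, soft_limit).

    Assumption: 16 bytes, two big-endian signed 64-bit integers.
    """
    raw = _to_bytes(hex_value)
    if len(raw) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    hard, soft = struct.unpack(">qq", raw)
    return hard, soft


def decode_size(hex_value):
    """Decode a quota.size value into a byte count.

    Assumption: 8 bytes, one big-endian signed 64-bit integer.
    """
    raw = _to_bytes(hex_value)
    if len(raw) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(raw))
    (size,) = struct.unpack(">q", raw)
    return size


if __name__ == "__main__":
    print(decode_limit_set("0x0000000006400000ffffffffffffffff"))
    print(decode_size("0x0000000000000a00"))
```

Under those assumptions, the limit-set value above decodes to a hard limit of 104857600 bytes (100 MiB) with the soft limit left at -1, and the size value decodes to 2560 bytes.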


I believe your quota configurations are backed up at /var/tmp/glusterfs/quota-config-backup/vol_<volname> which you can use to get the quota-configured directory names.
As for operating version, I think it is sufficient for it to be at 3 for the 3.5.x quota to work.
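The check described above (does every quota-configured directory carry the limit-set xattr on the brick?) could be scripted roughly as follows. This is a sketch, not an official tool: the function name and its arguments are hypothetical, the directory names are expected to come from your own quota configuration backup, and it has to run as root on a brick server because "trusted." xattrs are not readable by ordinary users.

```python
import errno
import os

LIMIT_XATTR = "trusted.glusterfs.quota.limit-set"


def dirs_missing_limit(brick_root, quota_dirs, getxattr=None):
    """Return the quota directories that lack the limit-set xattr.

    brick_root -- brick directory on this server, e.g. "/brick/1"
    quota_dirs -- volume-relative directory names, e.g. ["/1"]
    getxattr   -- injectable for testing; defaults to os.getxattr (Linux)
    """
    if getxattr is None:
        getxattr = os.getxattr
    missing = []
    for rel in quota_dirs:
        path = os.path.join(brick_root, rel.lstrip("/"))
        try:
            getxattr(path, LIMIT_XATTR)
        except OSError as exc:
            # ENODATA: xattr not set; ENOENT: directory not on this brick.
            if exc.errno in (errno.ENODATA, errno.ENOENT):
                missing.append(rel)
            else:
                raise
    return missing


if __name__ == "__main__":
    # Hypothetical values; substitute your brick path and directory list.
    print(dirs_missing_limit("/brick/1", ["/1"]))
```

An empty result on every brick would suggest the post-upgrade script did set the limits; on a distributed volume a directory should exist on every brick, so an ENOENT hit is itself worth a look.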

-Krutika


From: "David Gibbons" <david.c.gibbons@xxxxxxxxx>
To: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
Sent: Monday, December 1, 2014 6:35:55 PM
Subject: Re: Upgraded from 3.4.1 to 3.5.2, quota no longer working


Certainly, thank you for your response:

Quotad is running on all nodes:
[root@gfs-a-1 ~]# ps aux | grep quotad
root      3004  0.0  0.4 241368 68552 ?        Ssl  Nov30   0:07 /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/lib/glusterd/quotad/run/quotad.pid -l /usr/local/var/log/glusterfs/quotad.log -S /var/run/9d02605105ef0e74d913a4671c1143a1.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off

And the relevant output from gluster volume status shares per your request:
[root@gfs-a-1 ~]# gluster volume status shares | grep Quota
Quota Daemon on localhost                               N/A     Y       3004
Quota Daemon on gfs-a-3                                 N/A     Y       32307
Quota Daemon on gfs-a-4                                 N/A     Y       10818
Quota Daemon on gfs-a-2                                 N/A     Y       12292
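For clusters with more nodes, the same check can be done by parsing the status output instead of reading it by eye. The sketch below assumes the column layout shown above (process name, port, online flag, PID) and that the text has been captured from `gluster volume status shares`; the function name is hypothetical, and other GlusterFS versions may lay the table out differently.

```python
def quota_daemons(status_output):
    """Map each host to True/False depending on its Quota Daemon row.

    Assumes rows of the form:
        Quota Daemon on <host>   <port>   <Y|N>   <pid>
    """
    result = {}
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[:3] != ["Quota", "Daemon", "on"]:
            continue
        host = fields[3]
        online_flag = fields[-2]  # second-to-last column is Online
        result[host] = online_flag == "Y"
    return result


if __name__ == "__main__":
    sample = (
        "Quota Daemon on localhost    N/A     Y       3004\n"
        "Quota Daemon on gfs-a-3      N/A     Y       32307\n"
    )
    offline = [h for h, ok in quota_daemons(sample).items() if not ok]
    print("offline quota daemons:", offline)
```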
No log entries are created in /var/log/glusterfs/quotad.log when I run a quota list; all of the log entries are from yesterday. They do indicate a version mismatch, although I can't seem to locate where that version is specified:
[2014-11-30 13:21:55.173081] I [client-handshake.c:1474:client_setvolume_cbk] 0-shares-client-14: Server and Client lk-version numbers are not same, reopening the fds
[2014-11-30 13:21:55.173170] I [client-handshake.c:450:client_set_lk_version_cbk] 0-shares-client-14: Server lk version = 1
[2014-11-30 13:21:55.178739] I [rpc-clnt.c:1729:rpc_clnt_reconfig] 0-shares-client-9: changing port to 49154 (from 0)
[2014-11-30 13:21:55.181170] I [client-handshake.c:1677:select_server_supported_programs] 0-shares-client-9: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2014-11-30 13:21:55.181386] I [client-handshake.c:1462:client_setvolume_cbk] 0-shares-client-9: Connected to 172.16.10.13:49154, attached to remote volume '/mnt/a-3-shares-brick-3/brick'.
[2014-11-30 13:21:55.181401] I [client-handshake.c:1474:client_setvolume_cbk] 0-shares-client-9: Server and Client lk-version numbers are not same, reopening the fds
[2014-11-30 13:21:55.181535] I [client-handshake.c:450:client_set_lk_version_cbk] 0-shares-client-9: Server lk version = 1

I see the operating version for the volume as "3". I saw an unrelated thread that indicated this number should have more digits on a cluster running 3.5.2. The other thread also indicated that quota may not work if the volume version number is not compatible with the quota version running on the cluster. I can't seem to find the link right now.

It's almost as if the volume version did not get upgraded when the server version was upgraded. Is that possible?

Cheers,
Dave


On Sun, Nov 30, 2014 at 11:46 PM, Krutika Dhananjay <kdhananj@xxxxxxxxxx> wrote:
Hi,

Could you confirm whether quotad (Quota Daemon) is online from the output of `gluster volume status shares`?

Also, could you share the quota daemon's log file from the node where you executed the `quota list` command? You will find it at /var/log/glusterfs/quotad.log.

-Krutika


From: "David Gibbons" <david.c.gibbons@xxxxxxxxx>
To: "gluster-users" <gluster-users@xxxxxxxxxxx>
Sent: Sunday, November 30, 2014 9:57:02 PM
Subject: Upgraded from 3.4.1 to 3.5.2, quota no longer working


Hi All,

I performed a long-awaited upgrade from 3.4.1 to 3.5.2 today following the instructions for an offline upgrade outlined here:

I ran the pre- and post-upgrade scripts as instructed, intending to move the quotas over to the new version. The upgrade seemed to go well: the volume is online and appears to be functioning properly.

When I attempt to check quotas, the list is empty:
[root@gfs-a-1 glusterfs]# gluster volume quota shares list
                  Path                   Hard-limit Soft-limit   Used  Available
--------------------------------------------------------------------------------
[root@gfs-a-1 glusterfs]#


And upon execution of that command, the cli.log file fills up with entries like this. I am assuming it's one cli log entry per quota entry:
[2014-11-30 14:00:02.154143] W [cli-rpc-ops.c:2469:print_quota_list_from_quotad] 0-cli: path key is not present in dict
[2014-11-30 14:00:02.160507] W [cli-rpc-ops.c:2469:print_quota_list_from_quotad] 0-cli: path key is not present in dict
[2014-11-30 14:00:02.167947] W [cli-rpc-ops.c:2469:print_quota_list_from_quotad] 0-cli: path key is not present in dic
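One way to test the "one log entry per quota entry" assumption is to count these warnings for a single run of the list command and compare the total with the number of configured quota directories. The sketch below only assumes the log line shape visible above (timestamp, level letter, source location, message); the function name is hypothetical.

```python
import re

# [timestamp] LEVEL [file:line:function] message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[(?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def count_warnings(log_lines, needle="path key is not present in dict"):
    """Count warning-level log lines whose message contains needle."""
    count = 0
    for line in log_lines:
        match = LOG_RE.match(line.strip())
        if match and match.group("level") == "W" and needle in match.group("msg"):
            count += 1
    return count


if __name__ == "__main__":
    sample = [
        "[2014-11-30 14:00:02.154143] W "
        "[cli-rpc-ops.c:2469:print_quota_list_from_quotad] "
        "0-cli: path key is not present in dict",
    ]
    print(count_warnings(sample))
```

Passing an open file object for cli.log works as well, since the function just iterates over lines.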

So, it appears that the quota database has somehow gone offline or become corrupt. Any thoughts on what I can do to resolve this?

I have checked all of the binaries on all 4 machines in the cluster, and they all appear to be running the correct version:
[root@gfs-a-1 glusterfs]# glusterfsd --version
glusterfs 3.5.2 built on Nov 30 2014 08:16:37

Cheers,
Dave 

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users





