Re: Stale file handle

Thanks for the update! 

On Fri, 13 Mar, 2020, 9:40 PM Pat Haley, <phaley@xxxxxxx> wrote:

Hi All,

After performing Strahil's checks and poking around some more, we found
that the problem was the underlying XFS filesystem reporting that it was
full when it wasn't.  Following the information in the links below, we
found that mounting with 64-bit inodes (the XFS inode64 mount option)
fixed this problem.

https://serverfault.com/questions/357367/xfs-no-space-left-on-device-but-i-have-850gb-available

https://support.microfocus.com/kb/doc.php?id=7014318
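
For reference, a minimal sketch of checking whether a brick is mounted
with 64-bit inodes. It assumes the bricks are XFS (as the volume status
in the quoted message shows) and that the option in question is XFS's
inode64. The helper names are hypothetical, and /mnt/brick1 is simply
brick1's path from this thread. The usual explanation is that with 32-bit
inode numbers XFS keeps inodes in the low part of the device, so that
region can fill up while df still shows free space. Whether a remount is
enough, or a full unmount and mount is needed, depends on the kernel, so
the printed command is a suggestion only.

```shell
#!/bin/sh
# Sketch only: helpers to see whether a mount point uses inode64.
# Nothing here changes the system; the remount is printed, not run.

# Succeed when a comma-separated mount option string contains inode64.
has_inode64() {
    case ",$1," in
        *,inode64,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the mount options recorded in /proc/mounts for a mount point.
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4; exit }' /proc/mounts
}

# Report the inode64 state of one brick mount point.
report_brick() {
    opts="$(mount_opts "$1")"
    if [ -z "$opts" ]; then
        echo "$1: not a mount point on this host"
    elif has_inode64 "$opts"; then
        echo "$1: mounted with inode64"
    else
        echo "$1: no inode64; as root, try: mount -o remount,inode64 $1"
    fi
}
```

For example, `report_brick /mnt/brick1` on mseas-data2. To make the
option persistent it would also need to be added to the brick's options
in /etc/fstab.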

Thanks

Pat


On 3/12/20 4:24 PM, Strahil Nikolov wrote:
> On March 12, 2020 8:06:14 PM GMT+02:00, Pat Haley <phaley@xxxxxxx> wrote:
>> Hi
>>
>> Yesterday we seemed to clear an issue with erroneous "No space left on
>> device" messages
>> (https://lists.gluster.org/pipermail/gluster-users/2020-March/037848.html)
>>
>> I am now seeing "Stale file handle" messages coming from directories
>> I've just created.
>>
>> We are running gluster 3.7.11 in a distributed volume across 2 servers
>> (2 bricks each). For a newly created directory that reports "Stale file
>> handle", I've noticed that the directory does not appear in brick1 (it
>> is in the other 3 bricks).
>>
>> In the cli.log on the server with brick1 I'm seeing messages like
>>
>> --------------------------------------------------------
>> [2020-03-12 17:21:36.596908] I [cli.c:721:main] 0-cli: Started running gluster with version 3.7.11
>> [2020-03-12 17:21:36.604587] I [cli-cmd-volume.c:1795:cli_check_gsync_present] 0-: geo-replication not installed
>> [2020-03-12 17:21:36.605100] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>> [2020-03-12 17:21:36.605155] I [socket.c:2356:socket_event_handler] 0-transport: disconnecting now
>> [2020-03-12 17:21:36.617433] I [input.c:36:cli_batch] 0-: Exiting with: 0
>> --------------------------------------------------------
>>
>> I'm not sure why I would be getting any geo-replication messages, since
>> we aren't using replication. The cli.log on the other server is showing
>>
>> --------------------------------------------------------
>> [2020-03-12 17:27:08.172573] I [cli.c:721:main] 0-cli: Started running gluster with version 3.7.11
>> [2020-03-12 17:27:08.302564] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>> [2020-03-12 17:27:08.302716] I [socket.c:2356:socket_event_handler] 0-transport: disconnecting now
>> [2020-03-12 17:27:08.304557] I [input.c:36:cli_batch] 0-: Exiting with: 0
>> --------------------------------------------------------
>>
>>
>> On the server with brick1, the etc-glusterfs-glusterd.vol.log is showing
>>
>> --------------------------------------------------------
>> [2020-03-12 17:21:25.925394] I [MSGID: 106499] [glusterd-handler.c:4331:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data-volume
>> [2020-03-12 17:21:25.946240] W [MSGID: 106217] [glusterd-op-sm.c:4630:glusterd_op_modify_op_ctx] 0-management: Failed uuid to hostname conversion
>> [2020-03-12 17:21:25.946282] W [MSGID: 106387] [glusterd-op-sm.c:4734:glusterd_op_modify_op_ctx] 0-management: op_ctx modification failed
>> [2020-03-12 17:21:36.617090] I [MSGID: 106487] [glusterd-handler.c:1472:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
>> [2020-03-12 17:21:15.577829] I [MSGID: 106488] [glusterd-handler.c:1533:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
>> --------------------------------------------------------
>>
>> On the other server I'm seeing similar messages
>>
>> --------------------------------------------------------
>> [2020-03-12 17:26:57.024168] I [MSGID: 106499] [glusterd-handler.c:4331:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data-volume
>> [2020-03-12 17:26:57.037269] W [MSGID: 106217] [glusterd-op-sm.c:4630:glusterd_op_modify_op_ctx] 0-management: Failed uuid to hostname conversion
>> [2020-03-12 17:26:57.037299] W [MSGID: 106387] [glusterd-op-sm.c:4734:glusterd_op_modify_op_ctx] 0-management: op_ctx modification failed
>> [2020-03-12 17:26:42.025200] I [MSGID: 106488] [glusterd-handler.c:1533:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
>> [2020-03-12 17:27:08.304267] I [MSGID: 106487] [glusterd-handler.c:1472:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
>> --------------------------------------------------------
>>
>> And I've just noticed that I'm again seeing "No space left on device"
>> in the logs of brick1 (although there is 3.5 TB free)
>>
>> --------------------------------------------------------
>> [2020-03-12 17:19:54.576597] E [MSGID: 113027] [posix.c:1427:posix_mkdir] 0-data-volume-posix: mkdir of /mnt/brick1/projects/deep_sea_mining/Tide/2020/Mar06/ccfzR75deg_001 failed [No space left on device]
>> [2020-03-12 17:19:54.576681] E [MSGID: 115056] [server-rpc-fops.c:512:server_mkdir_cbk] 0-data-volume-server: 5001698: MKDIR /projects/deep_sea_mining/Tide/2020/Mar06/ccfzR75deg_001 (96e0b7e4-6b43-42ef-9896-86097b4208fe/ccfzR75deg_001) ==> (No space left on device) [No space left on device]
>> --------------------------------------------------------
>>
>> Any thoughts would be greatly appreciated.  (Some additional information
>> below)
>>
>> Thanks
>>
>> Pat
>>
>> --------------------------------------------------------
>> server 1:
>> [root@mseas-data2 ~]# df -h
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/sdb        164T  161T  3.5T  98% /mnt/brick2
>> /dev/sda        164T  159T  5.4T  97% /mnt/brick1
>>
>> [root@mseas-data2 ~]# df -i
>> Filesystem         Inodes    IUsed      IFree IUse% Mounted on
>> /dev/sdb       7031960320 31213790 7000746530    1% /mnt/brick2
>> /dev/sda       7031960320 28707456 7003252864    1% /mnt/brick1
>> --------------------------------------------------------
>>
>> --------------------------------------------------------
>> server 2:
>> [root@mseas-data3 ~]# df -h
>> Filesystem                     Size  Used Avail Use% Mounted on
>> /dev/sda                        91T   88T  3.9T  96% /export/sda/brick3
>> /dev/mapper/vg_Data4-lv_Data4   91T   89T  2.6T  98% /export/sdc/brick4
>>
>> [root@mseas-data3 glusterfs]# df -i
>> Filesystem                        Inodes    IUsed      IFree IUse% Mounted on
>> /dev/sda                      1953182464 10039172 1943143292    1% /export/sda/brick3
>> /dev/mapper/vg_Data4-lv_Data4 3906272768 11917222 3894355546    1% /export/sdc/brick4
>> --------------------------------------------------------
>>
>> --------------------------------------------------------
>> [root@mseas-data2 ~]# gluster volume info
>> --------------------------------------------------------
>> Volume Name: data-volume
>> Type: Distribute
>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> Status: Started
>> Number of Bricks: 4
>> Transport-type: tcp
>> Bricks:
>> Brick1: mseas-data2:/mnt/brick1
>> Brick2: mseas-data2:/mnt/brick2
>> Brick3: mseas-data3:/export/sda/brick3
>> Brick4: mseas-data3:/export/sdc/brick4
>> Options Reconfigured:
>> cluster.min-free-disk: 1%
>> nfs.export-volumes: off
>> nfs.disable: on
>> performance.readdir-ahead: on
>> diagnostics.brick-sys-log-level: WARNING
>> nfs.exports-auth-enable: on
>> server.allow-insecure: on
>> auth.allow: *
>> disperse.eager-lock: off
>> performance.open-behind: off
>> performance.md-cache-timeout: 60
>> network.inode-lru-limit: 50000
>> diagnostics.client-log-level: ERROR
>>
>> --------------------------------------------------------
>> [root@mseas-data2 ~]# gluster volume status data-volume detail
>> --------------------------------------------------------
>> Status of volume: data-volume
>> ------------------------------------------------------------------------------
>> Brick                : Brick mseas-data2:/mnt/brick1
>> TCP Port             : 49154
>> RDMA Port            : 0
>> Online               : Y
>> Pid                  : 4601
>> File System          : xfs
>> Device               : /dev/sda
>> Mount Options        : rw
>> Inode Size           : 256
>> Disk Space Free      : 5.4TB
>> Total Disk Space     : 163.7TB
>> Inode Count          : 7031960320
>> Free Inodes          : 7003252864
>> ------------------------------------------------------------------------------
>> Brick                : Brick mseas-data2:/mnt/brick2
>> TCP Port             : 49155
>> RDMA Port            : 0
>> Online               : Y
>> Pid                  : 7949
>> File System          : xfs
>> Device               : /dev/sdb
>> Mount Options        : rw
>> Inode Size           : 256
>> Disk Space Free      : 3.4TB
>> Total Disk Space     : 163.7TB
>> Inode Count          : 7031960320
>> Free Inodes          : 7000746530
>> ------------------------------------------------------------------------------
>> Brick                : Brick mseas-data3:/export/sda/brick3
>> TCP Port             : 49153
>> RDMA Port            : 0
>> Online               : Y
>> Pid                  : 4650
>> File System          : xfs
>> Device               : /dev/sda
>> Mount Options        : rw
>> Inode Size           : 512
>> Disk Space Free      : 3.9TB
>> Total Disk Space     : 91.0TB
>> Inode Count          : 1953182464
>> Free Inodes          : 1943143292
>> ------------------------------------------------------------------------------
>> Brick                : Brick mseas-data3:/export/sdc/brick4
>> TCP Port             : 49154
>> RDMA Port            : 0
>> Online               : Y
>> Pid                  : 23772
>> File System          : xfs
>> Device               : /dev/mapper/vg_Data4-lv_Data4
>> Mount Options        : rw
>> Inode Size           : 256
>> Disk Space Free      : 2.6TB
>> Total Disk Space     : 90.9TB
>> Inode Count          : 3906272768
>> Free Inodes          : 3894355546
>>
>> --
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley                          Email:  phaley@xxxxxxx
>> Center for Ocean Engineering       Phone:  (617) 253-6824
>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>> 77 Massachusetts Avenue
>> Cambridge, MA  02139-4301
>>
>> ________
>>
>>
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> https://lists.gluster.org/mailman/listinfo/gluster-users
> Hey Pat,
>
> The logs are not providing much information, but the following seems strange:
> 'Failed uuid to hostname conversion'
>
> Have you checked DNS resolution (both short name and FQDN)?
> Also, check that the systems' ntp/chrony is in sync, and check 'gluster peer status' on all nodes.
>
> Is it possible that the client is not reaching all bricks?
>
>
> P.S.: Consider increasing the log level, as the current level is not sufficient.
>
> Best Regards,
> Strahil Nikolov
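
The checks suggested above could be scripted roughly as follows. This is
a sketch, not a Gluster-provided tool: the helper names and the LOOKUP
variable are hypothetical, and it assumes getent, and either chronyc or
ntpstat, are installed on the nodes. The helpers only read state; they
change no configuration.

```shell
#!/bin/sh
# Sketch only: small helpers for the suggested checks. Source this file
# on each node and call the functions by hand.

# Command used to resolve a name (assumption: getent is available).
LOOKUP="${LOOKUP:-getent hosts}"

# Report whether a host name resolves; returns non-zero when it does not.
check_resolves() {
    if $LOOKUP "$1" >/dev/null 2>&1; then
        echo "ok: $1 resolves"
    else
        echo "FAIL: $1 does not resolve"
        return 1
    fi
}

# Show time-sync state with whichever client is installed.
check_time_sync() {
    if command -v chronyc >/dev/null 2>&1; then
        chronyc tracking
    elif command -v ntpstat >/dev/null 2>&1; then
        ntpstat
    else
        echo "no chronyc or ntpstat found; check time sync manually"
    fi
}

# Show how this node sees its peers.
check_peers() {
    gluster peer status
}
```

On each node one might run `check_resolves mseas-data2` and
`check_resolves mseas-data3` (and the same for the FQDNs, which are not
given in this thread), then `check_time_sync` and `check_peers`. To raise
the client log level from the ERROR shown in the volume info, something
like `gluster volume set data-volume diagnostics.client-log-level INFO`
should work.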

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley@xxxxxxx
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301
