Re: How to troubleshoot rsync to cephfs via nfs-ganesha stalling

Hi Daniel, thanks for looking at this. 

These are the mount options:
 type nfs4 (rw,nodev,relatime,vers=4,intr,local_lock=none,retrans=2,proto=tcp,rsize=8192,wsize=8192,hard,namlen=255,sec=sys)

I have overwritten the original files, so I cannot check whether they had 
holes. To be honest, I don't even know how to query a file to identify 
holes. 
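
For what it's worth, one way to query a file for holes on Linux is to walk it with lseek() using SEEK_HOLE and SEEK_DATA. The following is a minimal sketch, assuming Python 3 on Linux; the function name find_holes is made up for illustration. On a filesystem that does not report holes, the whole file simply appears as data and the result is an empty list.

```python
import errno
import os


def find_holes(path):
    """Return a list of (start, end) byte ranges reported as holes in path."""
    holes = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            # Next hole at or after offset. Every file has an implicit
            # hole at end-of-file, so this call succeeds while offset < size.
            hole_start = os.lseek(fd, offset, os.SEEK_HOLE)
            if hole_start >= size:
                break  # only the implicit hole at EOF is left
            try:
                # Next data region after the start of the hole.
                data_start = os.lseek(fd, hole_start, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno != errno.ENXIO:
                    raise
                data_start = size  # the hole runs to the end of the file
            holes.append((hole_start, data_start))
            offset = data_start
    finally:
        os.close(fd)
    return holes
```

Calling find_holes("CentOS_BuildTag") on a plain 14-byte text file like the one below should return an empty list; a non-empty list would mean the file is sparse.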

These are the contents of the files, just plain text.
[@os0 CentOS7-x86_64]# cat CentOS_BuildTag
20181125-1500
[@os0 CentOS7-x86_64]# cat .discinfo
1543162572.807980
7.6
x86_64



-----Original Message-----
From: Daniel Gryniewicz [mailto:dang@xxxxxxxxxx] 
Sent: 10 December 2018 15:54
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: How to troubleshoot rsync to cephfs via nfs-ganesha stalling

This isn't something I've seen before.  rsync generally works fine, even 
over cephfs.  More inline.

On 12/09/2018 09:42 AM, Marc Roos wrote:
> 
> 
> This rsync command fails and makes the local nfs unavailable (I have to 
> stop nfs-ganesha, kill all rsync processes on the client and then 
> start nfs-ganesha)
> 
> rsync -rlptDvSHP --delete  --exclude config.repo --exclude "local*"
> --exclude "isos"
> anonymous@xxxxxxxxxxxxxxxxxxxxxxxxxxx::centos/7/os/x86_64/
> /localpath/CentOS7-x86_64/
> 
> When I do individual rsyncs on the subfolders
> 
> -rw-r--r-- 1 nobody 500   14 Nov 25 17:01 CentOS_BuildTag
> -rw-r--r-- 1 nobody 500   29 Nov 25 17:16 .discinfo
> drwxr-xr-x 3 nobody 500 8.3M Nov 25 17:20 EFI
> -rw-rw-r-- 1 nobody 500  227 Aug 30  2017 EULA
> -rw-rw-r-- 1 nobody 500  18K Dec  9  2015 GPL
> drwxr-xr-x 3 nobody 500 572M Nov 25 17:21 images
> drwxr-xr-x 2 nobody 500  57M Dec  9 14:11 isolinux
> drwxr-xr-x 2 nobody 500 433M Nov 25 17:20 LiveOS
> drwxrwxr-x 2 nobody 500 9.5G Nov 25 16:58 Packages
> drwxrwxr-x 2 nobody 500  29M Dec  9 13:53 repodata
> -rw-rw-r-- 1 nobody 500 1.7K Dec  9  2015 RPM-GPG-KEY-CentOS-7
> -rw-rw-r-- 1 nobody 500 1.7K Dec  9  2015 RPM-GPG-KEY-CentOS-Testing-7
> -rw-r--r-- 1 nobody 500  354 Nov 25 17:21 .treeinfo
> 
> These rsyncs are all going fine.
> 
> rsync -rlptDvSHP --delete  --exclude config.repo --exclude "local*"
> --exclude "isos"
> anonymous@xxxxxxxxxxxxxxxxxxxxxxxxxxx::centos/7/os/x86_64/Packages/
> /localpath/CentOS7-x86_64/Packages/
> rsync -rlptDvSHP --delete  --exclude config.repo --exclude "local*"
> --exclude "isos"
> anonymous@xxxxxxxxxxxxxxxxxxxxxxxxxxx::centos/7/os/x86_64/repodata/
> /localpath/CentOS7-x86_64/repodata/
> rsync -rlptDvSHP --delete  --exclude config.repo --exclude "local*"
> --exclude "isos"
> anonymous@xxxxxxxxxxxxxxxxxxxxxxxxxxx::centos/7/os/x86_64/LiveOS/
> /localpath/CentOS7-x86_64/LiveOS/
> 
> Except when I try to rsync the file CentOS_BuildTag; then everything 
> stalls, leaving behind files such as:
> -rw------- 1 500 500     0 Dec  9 14:26 .CentOS_BuildTag.2igwc5
> -rw------- 1 500 500     0 Dec  9 14:28 .CentOS_BuildTag.tkiwc5

So something is failing on the write, it seems.  These are the temporary 
files made by rsync, and they're empty, so the initial write seems to 
have failed.
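
One way to check whether such a small write also fails outside of rsync is to imitate the same pattern by hand: create a hidden temporary file in the destination directory, write the data, then rename it into place. The sketch below assumes Python 3; the function name write_like_rsync is hypothetical, and it only approximates what rsync does (it does not use rsync's sparse handling from -S). Note that on a "hard" NFS mount a stuck write would block this script in the same way.

```python
import os
import tempfile


def write_like_rsync(dest_dir, name, data):
    """Write data to a hidden temp file in dest_dir, then rename it to name.

    Returns the size of the final file in bytes.
    """
    # mkstemp creates the file with mode 0600, e.g. ".CentOS_BuildTag.ab12cd34"
    fd, tmp_path = tempfile.mkstemp(prefix="." + name + ".", dir=dest_dir)
    try:
        written = 0
        while written < len(data):
            written += os.write(fd, data[written:])
        os.fsync(fd)  # push the data to the server before renaming
    finally:
        os.close(fd)
    # If the write or fsync raised, the temp file is left behind for inspection.
    final_path = os.path.join(dest_dir, name)
    os.rename(tmp_path, final_path)
    return os.path.getsize(final_path)
```

For example, write_like_rsync("/localpath/CentOS7-x86_64", "CentOS_BuildTag", b"20181125-1500\n") should return 14 if the write path through Ganesha is healthy.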

> I can resolve this by doing a wget and moving the file to the location:
> wget 'http://mirror.ams1.nl.leaseweb.net/centos/7/os/x86_64/CentOS_BuildTag'
> mv CentOS_BuildTag /localpath/CentOS7-x86_64/
> 
> I also had problems with .discinfo, and when I ls this directory on 
> the cephfs mount it takes a long time to produce output.
> 
> When I do the full rsync to the cephfs mount it completes without 
> errors; when I later do the sync on the nfs mount it also completes 
> (nothing being copied)

This confirms that it's not metadata related, as this second successful 
rsync is purely metadata.

> Does anybody know what I should do to resolve this? Is this a typical 
> ganesha issue, or is this cephfs corruption that makes ganesha stall?

Writes in Ganesha are pretty much passthrough, modulo some metadata 
tracking.  This means that a write hang is likely to be somewhere 
between Ganesha and CephFS.  However, this is a single, small file, so I 
don't see how it could hang, especially when wget can copy the file 
correctly.  Maybe there's something about the structure of the file? 
Does it have holes in it, for example?

Also, can you send the mount options for the NFS mount?

Daniel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

