Re: ln: failed to create hard link 'file name': Read-only file system

On 22/03/2023 23:41, Gregory Farnum wrote:
On Wed, Mar 22, 2023 at 8:27 AM Frank Schilder <frans@xxxxxx> wrote:

Hi Gregory,

thanks for your reply. First a quick update. Here is how I get ln to work
after it failed; there seems to be no timeout:

$ ln envs/satwindspy/include/ffi.h mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h
ln: failed to create hard link 'mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h': Read-only file system
$ ls -l envs/satwindspy/include mambaforge/pkgs/libffi-3.3-h58526e2_2
envs/satwindspy/include:
total 7664
-rw-rw-r--.   1 rit rit    959 Mar  5  2021 ares_build.h
[...]
$ ln envs/satwindspy/include/ffi.h mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h

After an ls -l on both directories ln works.

To the question: How can I pull out a log from the nfs server? There is
nothing in /var/log/messages.

So you’re using the kernel server and re-exporting, right?

I’m not very familiar with its implementation; I wonder if it’s doing
something strange via the kernel vfs.
AFAIK this isn’t really supportable for general use because nfs won’t
respect the CephFS file consistency protocol. But maybe it’s trying a bit
and that’s causing trouble?

Yeah, I think you are right, Greg.

I checked the logs uploaded by Frank and found that the kclient just sent one request like this:

++++++

2023-03-27T23:24:37.866+0200 7f0c1a0d1700  7 mds.0.server dispatch_client_request client_request(client.186555:475421 link #0x10000682337/liblz4.so.1.9.3 #0x1000066d6d8// 2023-03-27T23:24:37.864907+0200 caller_uid=1000, caller_gid=1000{4,24,27,30,46,122,134,135,1000,}) v4
2023-03-27T23:24:37.866+0200 7f0c1a0d1700  7 mds.0.server handle_client_link #0x10000682337/liblz4.so.1.9.3 to #0x1000066d6d8//
2023-03-27T23:24:37.866+0200 7f0c1a0d1700 10 mds.0.server rdlock_two_paths_xlock_destdn request(client.186555:475421 nref=2 cr=0x5601bbc60500) #0x10000682337/liblz4.so.1.9.3 #0x1000066d6d8//
2023-03-27T23:24:37.866+0200 7f0c1a0d1700  7 mds.0.server reply_client_request -30 ((30) Read-only file system) client_request(client.186555:475421 link #0x10000682337/liblz4.so.1.9.3 #0x1000066d6d8// 2023-03-27T23:24:37.864907+0200 caller_uid=1000, caller_gid=1000{4,24,27,30,46,122,134,135,1000,}) v4

------

The kclient set the src dentry path to "#0x1000066d6d8//". The MDS parses the trailing "//" as a snapdir, which is read-only; that is why the MDS returns a -EROFS error.

But from the MDS logs we can see that "0x1000066d6d8" is "/data/nfs/envs/satwindspy/lib/liblz4.so.1.9.3":

++++++

2023-03-27T23:24:37.866+0200 7f0c1a0d1700  7 mds.0.locker issue_caps allowed=pAsLsXsFscrl, xlocker allowed=pAsLsXsFscrl on [inode 0x1000066d6d8 [...7b,head] /data/nfs/envs/satwindspy/lib/liblz4.so.1.9.3 auth v7035 snaprealm=0x55fe3785e500 s=215880 nl=2 n(v0 rc2023-03-27T23:15:22.568391+0200 b215880 1=1+0) (iversion lock) caps={186555=pAsXsFscr/-@3} | ptrwaiter=0 request=0 lock=0 caps=1 remoteparent=1 dirtyparent=0 dirty=0 authpin=0 0x5601b7174800]

------


Then from the kernel debug logs:

++++++

31358125 [16380611.812642] ceph:  do_request mds0 session 00000000a66983cb state open
31358126 [16380611.812644] ceph:  __prepare_send_request 000000001ebc34fd tid 475421 link (attempt 1)
31358127 [16380611.812647] ceph:   dentry 000000006cbb0f2e 10000682337/liblz4.so.1.9.3
31358128 [16380611.812649] ceph:   dentry 00000000126d4660 1000066d6d8//

------

We can see that the kclient set the src dentry path to "1000066d6d8//".

This is incorrect; it should be "1000066d2e3/liblz4.so.1.9.3", where "1000066d2e3" is the parent directory's inode number and its path is "/data/nfs/envs/satwindspy/lib/".
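As far as I can tell from the log format, the source path the client sends is just the (pino, ppath) pair joined as "#<base ino>/<relative path>", so a buggy pair of (the file's own ino, "/") collapses into the "//" form above. Here is a tiny userspace sketch, purely illustrative and not ceph code, that only prints the two compositions using the inode numbers and name from the logs:

#include <stdio.h>

int main(void)
{
        /* What the kclient actually sent: pino is the file's own ino and
         * the relative path degenerated to "/", so the result ends in "//". */
        unsigned long long pino = 0x1000066d6d8ULL;
        const char *ppath = "/";
        printf("#0x%llx/%s\n", pino, ppath);   /* -> #0x1000066d6d8// */

        /* What it should have sent: the parent dir's ino plus the file name. */
        pino = 0x1000066d2e3ULL;
        ppath = "liblz4.so.1.9.3";
        printf("#0x%llx/%s\n", pino, ppath);   /* -> #0x1000066d2e3/liblz4.so.1.9.3 */

        return 0;
}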

From the kernel ceph code we can see that when ceph_link() sends the request, build_dentry_path() parses the src dentry:

2735 static int build_dentry_path(struct dentry *dentry, struct inode *dir,
2736                              const char **ppath, int *ppathlen, u64 *pino,
2737                              bool *pfreepath, bool parent_locked)
2738 {
2739         char *path;
2740
2741         rcu_read_lock();
2742         if (!dir)
2743                 dir = d_inode_rcu(dentry->d_parent);
2744         if (dir && parent_locked && ceph_snap(dir) == CEPH_NOSNAP && !IS_ENCRYPTED(dir)) {
2745                 *pino = ceph_ino(dir);
2746                 rcu_read_unlock();
2747                 *ppath = dentry->d_name.name;
2748                 *ppathlen = dentry->d_name.len;
2749                 return 0;
2750         }
2751         rcu_read_unlock();
2752         path = ceph_mdsc_build_path(dentry, ppathlen, pino, 1);
2753         if (IS_ERR(path))
2754                 return PTR_ERR(path);
2755         *ppath = path;
2756         *pfreepath = true;
2757         return 0;
2758 }

At Line#2743, 'dir' was resolved to the inode of "liblz4.so.1.9.3" itself, ino# "1000066d6d8", which is incorrect; it should be the parent dir's ino# "1000066d2e3". And at Line#2747 the resulting "ppath" is "/", which is also incorrect; it should be "liblz4.so.1.9.3".

That means the NFS client passed an invalid or corrupted old_dentry to kernel ceph. I have no idea how that could happen.
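One possibility (an assumption on my side, not something I can confirm from these logs): if knfsd resolved the source file handle to a disconnected alias dentry, such a dentry is its own parent and carries the anonymous name "/", so d_inode_rcu(dentry->d_parent) at Line#2743 would return the file's own inode and dentry->d_name at Line#2747 would contribute the stray "/". A minimal sketch of a check for that situation, illustrative only and untested; dentry_parent_is_usable() is a hypothetical helper, not existing kernel code:

#include <linux/dcache.h>

/*
 * Hypothetical helper: reject dentries whose d_parent cannot be trusted
 * for building a "parent ino + name" path, i.e. disconnected aliases
 * that are their own parent.  Illustration only, not a proposed fix.
 */
static bool dentry_parent_is_usable(struct dentry *dentry)
{
        if (IS_ROOT(dentry) || (dentry->d_flags & DCACHE_DISCONNECTED))
                return false;
        return true;
}

If the fast path at Line#2744 were additionally gated on such a check, a dentry like this would fall through to ceph_mdsc_build_path() instead of trusting d_parent; whether that alone would produce a usable path here is a separate question.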

@Frank,

Could you check the nfs client logs?

Thanks,

- Xiubo

-Greg



I can't reproduce it with simple commands on the NFS client. It seems to
occur only when a large number of files/dirs is created. I can make the
archive available to you if this helps.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Gregory Farnum <gfarnum@xxxxxxxxxx>
Sent: Wednesday, March 22, 2023 4:14 PM
To: Frank Schilder
Cc: ceph-users@xxxxxxx
Subject: Re: Re: ln: failed to create hard link 'file name': Read-only file system

Do you have logs of what the nfs server is doing?
Managed to reproduce it in terms of direct CephFS ops?


On Wed, Mar 22, 2023 at 8:05 AM Frank Schilder <frans@xxxxxx> wrote:
I have to correct myself. It also fails on an export with "sync" mode.
Here is an strace on the client (strace ln envs/satwindspy/include/ffi.h
mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h):

[...]
stat("mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h",
0x7ffdc5c32820) = -1 ENOENT (No such file or directory)
lstat("envs/satwindspy/include/ffi.h", {st_mode=S_IFREG|0664,
st_size=13934, ...}) = 0
linkat(AT_FDCWD, "envs/satwindspy/include/ffi.h", AT_FDCWD,
"mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h", 0) = -1 EROFS
(Read-only file system)
[...]
write(2, "ln: ", 4ln: )                     = 4
write(2, "failed to create hard link 'mamb"..., 80failed to create hard
link 'mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h') = 80
[...]
write(2, ": Read-only file system", 23: Read-only file system) = 23
write(2, "\n", 1
)                       = 1
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++

Has anyone advice?

Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: Wednesday, March 22, 2023 2:44 PM
To: ceph-users@xxxxxxx
Subject: ln: failed to create hard link 'file name': Read-only file system

Hi all,

on an NFS re-export of a ceph-fs (kernel client) I observe a very strange error. I'm un-tarring a large package (1.2G) and after some time I get these errors:

ln: failed to create hard link 'file name': Read-only file system

The strange thing is that this seems to be only temporary. When I used "ln src dst" for manual testing, the command failed as above. However, after that I tried "ln -v src dst" and this command created the hard link with exactly the same path arguments. During the period when the error occurs, I can't see any FS in read-only mode, neither on the NFS client nor on the NFS server. The funny thing is that file creation and writing still work; it's only the hard-link creation that fails.

For details, the set-up is:

file-server: mount ceph-fs at /shares/path, export /shares/path as nfs4 to other server
other server: mount /shares/path as NFS

More precisely, on the file-server:

fstab: MON-IPs:/shares/folder /shares/nfs/folder ceph defaults,noshare,name=NAME,secretfile=sec.file,mds_namespace=FS-NAME,_netdev 0 0
exports: /shares/nfs/folder -no_root_squash,rw,async,mountpoint,no_subtree_check DEST-IP

On the host at DEST-IP:

fstab: FILE-SERVER-IP:/shares/nfs/folder /mnt/folder nfs defaults,_netdev 0 0

Both the file server and the client server are virtual machines. The file server is on CentOS 8 Stream (4.18.0-338.el8.x86_64) and the client machine is on AlmaLinux 8 (4.18.0-425.13.1.el8_7.x86_64).

When I change the NFS export from "async" to "sync", everything works. However, that's a rather bad workaround and not a solution. Although this looks like an NFS issue, I'm afraid it is a problem with hard links and ceph-fs. It looks like a race between scheduling and executing operations on the ceph-fs kernel mount.

Has anyone seen something like that?

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14





--
Best Regards,

Xiubo Li (李秀波)

Email: xiubli@xxxxxxxxxx/xiubli@xxxxxxx
Slack: @Xiubo Li
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



