Re: ln: failed to create hard link 'file name': Read-only file system

> Sorry for the late reply.
No worries.

> The ceph qa teuthology test cases already include a similar test, which
> untars a kernel tarball, but I have never seen this issue there.
>
> I will try this again tomorrow without the NFS client.

Great. In case you would like to use the archive I sent you a link for, please keep it confidential. It contains files not for publication.

I will collect the log information you asked for.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Xiubo Li <xiubli@xxxxxxxxxx>
Sent: Monday, March 27, 2023 4:15 PM
To: Frank Schilder; Gregory Farnum
Cc: ceph-users@xxxxxxx
Subject: Re:  Re: ln: failed to create hard link 'file name': Read-only file system

Frank,

Sorry for the late reply.

On 24/03/2023 01:56, Frank Schilder wrote:
> Hi Xiubo and Gregory,
>
> sorry for the slow reply, I did some more debugging and didn't have too much time. First, some questions about collecting logs, but please also see below for how to reproduce the issue yourselves.
>
> I can reproduce it reliably but need some input for these:
>
>> enabling the kclient debug logs and
> How do I do that? I thought the kclient ignores the ceph.conf and I'm not aware of a mount option to this effect. Is there a "ceph config set ..." setting I can change for a specific client (by host name/IP) and how exactly?
>
$ echo "module ceph +p" > /sys/kernel/debug/dynamic_debug/control

This will enable the debug logs in the kernel ceph client. Then please provide
the kernel message logs.
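
In case it helps, here is a minimal capture/teardown sketch (the debug output
goes to the kernel ring buffer; the log file name below is just an example):

$ dmesg --clear                     # start from an empty kernel log
$ echo "module ceph +p" > /sys/kernel/debug/dynamic_debug/control
$ # ... reproduce the failing tar/ln run here ...
$ dmesg > /tmp/kclient-debug.log    # or: journalctl -k > /tmp/kclient-debug.log
$ echo "module ceph -p" > /sys/kernel/debug/dynamic_debug/control   # disable again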


>> also the mds debug logs
> I guess here I should set a higher loglevel for the MDS serving this directory (it is pinned to a single rank) or is it something else?

$ ceph daemon mds.X config set debug_mds 25
$ ceph daemon mds.X config set debug_ms 1
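
Once the run is captured, the levels can be turned back down so the MDS log
does not keep growing (a sketch, assuming the stock defaults of 1/5 for
debug_mds and 0/5 for debug_ms; the output lands in the MDS log file,
typically /var/log/ceph/ceph-mds.X.log):

$ ceph daemon mds.X config set debug_mds 1/5
$ ceph daemon mds.X config set debug_ms 0/5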

>
> The issue seems to require a certain load to show up. I created a minimal tar file mimicking the problem, with 2 directories and a hard link from a file in the first to a new name in the second directory. This does not cause any problems, so it's not that easy to reproduce.
>
> How you can reproduce it:
>
> As an alternative to my limited log-collecting skills, I am making the tgz archive available to you both. You will receive an e-mail from our OneDrive with a download link. If you untar the archive on an NFS client dir that is a re-export of a kclient mount, after some time you should see the errors showing up.
>
> I can reliably reproduce these errors on our production as well as our test cluster. You should be able to reproduce it too with the tgz file.
>
> Here is a result on our set-up:
>
> - production cluster (executed in a sub-dir conda to make cleanup easy):
>
> $ time tar -xzf ../conda.tgz
> tar: mambaforge/pkgs/libstdcxx-ng-9.3.0-h6de172a_18/lib/libstdc++.so.6.0.28: Cannot hard link to ‘envs/satwindspy/lib/libstdc++.so.6.0.28’: Read-only file system
> [...]
> tar: mambaforge/pkgs/boost-cpp-1.72.0-h9d3c048_4/lib/libboost_log.so.1.72.0: Cannot hard link to ‘envs/satwindspy/lib/libboost_log.so.1.72.0’: Read-only file system
> ^C
>
> real    1m29.008s
> user    0m0.612s
> sys     0m6.870s
>
> By this time there are already hard links created, so it doesn't fail right away:
> $ find -type f -links +1
> ./mambaforge/pkgs/libev-4.33-h516909a_1/share/man/man3/ev.3
> ./mambaforge/pkgs/libev-4.33-h516909a_1/include/ev++.h
> ./mambaforge/pkgs/libev-4.33-h516909a_1/include/ev.h
> ...
>
> - test cluster (octopus latest stable, 3 OSD hosts with 3 HDD OSDs each, simple ceph-fs):
>
> # ceph fs status
> fs - 2 clients
> ==
> RANK  STATE     MDS        ACTIVITY     DNS    INOS
>   0    active  tceph-02  Reqs:    0 /s  1807k  1739k
>    POOL      TYPE     USED  AVAIL
> fs-meta1  metadata  18.3G   156G
> fs-meta2    data       0    156G
> fs-data     data    1604G   312G
> STANDBY MDS
>    tceph-01
>    tceph-03
> MDS version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
>
> It's the new recommended 3-pool layout, with fs-data being a 4+2 EC pool.
>
> $ time tar -xzf / ... /conda.tgz
> tar: mambaforge/ssl/cacert.pem: Cannot hard link to ‘envs/satwindspy/ssl/cacert.pem’: Read-only file system
> [...]
> tar: mambaforge/lib/engines-1.1/padlock.so: Cannot hard link to ‘envs/satwindspy/lib/engines-1.1/padlock.so’: Read-only file system
> ^C
>
> real    6m23.522s
> user    0m3.477s
> sys     0m25.792s
>
> Same story here, a large number of hard links has already been created before it starts failing:
>
> $ find -type f -links +1
> ./mambaforge/lib/liblzo2.so.2.0.0
> ...
>
> Looking at the find output in both cases, the point at which it starts failing also looks a bit non-deterministic.
>
> It would be great if you could reproduce the issue on a similar test setup using the archive conda.tgz. If not, I'm happy to collect any type of logs on our test cluster.
>
> We now have one user who has problems with rsync to an NFS share, and it would be really appreciated if this could be sorted out.

The ceph qa teuthology test cases already include a similar test, which
untars a kernel tarball, but I have never seen this issue there.

I will try this again tomorrow without the NFS client.

Thanks

- Xiubo


> Thanks for your help and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Xiubo Li <xiubli@xxxxxxxxxx>
> Sent: Thursday, March 23, 2023 2:41 AM
> To: Frank Schilder; Gregory Farnum
> Cc: ceph-users@xxxxxxx
> Subject: Re:  Re: ln: failed to create hard link 'file name': Read-only file system
>
> Hi Frank,
>
> Could you reproduce it again by enabling the kclient debug logs and also
> the mds debug logs?
>
> I need to know what exactly happened on the kclient and mds side.
> Locally I couldn't reproduce it.
>
> Thanks
>
> - Xiubo
>
> On 22/03/2023 23:27, Frank Schilder wrote:
>> Hi Gregory,
>>
>> thanks for your reply. First a quick update. Here is how I get ln to work after it has failed; there seems to be no timeout:
>>
>> $ ln envs/satwindspy/include/ffi.h mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h
>> ln: failed to create hard link 'mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h': Read-only file system
>> $ ls -l envs/satwindspy/include mambaforge/pkgs/libffi-3.3-h58526e2_2
>> envs/satwindspy/include:
>> total 7664
>> -rw-rw-r--.   1 rit rit    959 Mar  5  2021 ares_build.h
>> [...]
>> $ ln envs/satwindspy/include/ffi.h mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h
>>
>> After an ls -l on both directories, ln works.
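>>
>> (For scripted runs, a crude interim workaround follows from the behaviour above; this is only a sketch, with src and dst as placeholder paths:)
>>
>> $ ln "$src" "$dst" || { ls -l "$(dirname "$src")" "$(dirname "$dst")" > /dev/null; ln -v "$src" "$dst"; }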
>>
>> To your question: how can I pull a log out of the NFS server? There is nothing in /var/log/messages.
>>
>> I can't reproduce it with simple commands on the NFS client. It seems to occur only when a large number of files/dirs is created. I can make the archive available to you if this helps.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Gregory Farnum <gfarnum@xxxxxxxxxx>
>> Sent: Wednesday, March 22, 2023 4:14 PM
>> To: Frank Schilder
>> Cc: ceph-users@xxxxxxx
>> Subject: Re:  Re: ln: failed to create hard link 'file name': Read-only file system
>>
>> Do you have logs of what the nfs server is doing?
>> Have you managed to reproduce it in terms of direct CephFS ops?
>>
>>
>> On Wed, Mar 22, 2023 at 8:05 AM Frank Schilder <frans@xxxxxx<mailto:frans@xxxxxx>> wrote:
>> I have to correct myself. It also fails on an export with "sync" mode. Here is an strace on the client (strace ln envs/satwindspy/include/ffi.h mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h):
>>
>> [...]
>> stat("mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h", 0x7ffdc5c32820) = -1 ENOENT (No such file or directory)
>> lstat("envs/satwindspy/include/ffi.h", {st_mode=S_IFREG|0664, st_size=13934, ...}) = 0
>> linkat(AT_FDCWD, "envs/satwindspy/include/ffi.h", AT_FDCWD, "mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h", 0) = -1 EROFS (Read-only file system)
>> [...]
>> write(2, "ln: ", 4ln: )                     = 4
>> write(2, "failed to create hard link 'mamb"..., 80failed to create hard link 'mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h') = 80
>> [...]
>> write(2, ": Read-only file system", 23: Read-only file system) = 23
>> write(2, "\n", 1
>> )                       = 1
>> lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
>> close(0)                                = 0
>> close(1)                                = 0
>> close(2)                                = 0
>> exit_group(1)                           = ?
>> +++ exited with 1 +++
>>
>> Does anyone have advice?
>>
>> Thanks!
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Frank Schilder <frans@xxxxxx<mailto:frans@xxxxxx>>
>> Sent: Wednesday, March 22, 2023 2:44 PM
>> To: ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx>
>> Subject:  ln: failed to create hard link 'file name': Read-only file system
>>
>> Hi all,
>>
>> on an NFS re-export of a ceph-fs (kernel client) I observe a very strange error. I'm untarring a larger package (1.2G), and after some time I get these errors:
>>
>> ln: failed to create hard link 'file name': Read-only file system
>>
>> The strange thing is that this seems to be only temporary. When I used "ln src dst" for manual testing, the command failed as above. However, after that I tried "ln -v src dst", and this command created the hard link with exactly the same path arguments. During the period when the error occurs, I can't see any FS in read-only mode, neither on the NFS client nor on the NFS server. The funny thing is that file creation and writing still work; it's only the hard-link creation that fails.
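>>
>> (A quick way to double-check this during the failure window, on both hosts, is to look at the mount options of the mounts listed in the set-up below; neither reports "ro":)
>>
>> $ findmnt -o TARGET,FSTYPE,OPTIONS /mnt/folder
>> $ findmnt -o TARGET,FSTYPE,OPTIONS /shares/nfs/folder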
>>
>> For details, the set-up is:
>>
>> file-server: mount ceph-fs at /shares/path, export /shares/path as nfs4 to other server
>> other server: mount /shares/path as NFS
>>
>> More precisely, on the file-server:
>>
>> fstab: MON-IPs:/shares/folder /shares/nfs/folder ceph defaults,noshare,name=NAME,secretfile=sec.file,mds_namespace=FS-NAME,_netdev 0 0
>> exports: /shares/nfs/folder -no_root_squash,rw,async,mountpoint,no_subtree_check DEST-IP
>>
>> On the host at DEST-IP:
>>
>> fstab: FILE-SERVER-IP:/shares/nfs/folder /mnt/folder nfs defaults,_netdev 0 0
>>
>> Both the file server and the client server are virtual machines. The file server is on CentOS 8 Stream (4.18.0-338.el8.x86_64) and the client machine is on AlmaLinux 8 (4.18.0-425.13.1.el8_7.x86_64).
>>
>> When I change the NFS export from "async" to "sync", everything works. However, that's a rather bad workaround and not a solution. Although this looks like an NFS issue, I'm afraid it is a problem with hard links and ceph-fs. It looks like a race between scheduling and executing operations on the ceph-fs kernel mount.
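>>
>> (For reference, the "sync" variant is simply the export line from the set-up above with async replaced, followed by re-exporting:)
>>
>> exports: /shares/nfs/folder -no_root_squash,rw,sync,mountpoint,no_subtree_check DEST-IP
>> # exportfs -ra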
>>
>> Has anyone seen something like that?
>>
>> Thanks and best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
> --
> Best Regards,
>
> Xiubo Li (李秀波)
>
> Email: xiubli@xxxxxxxxxx/xiubli@xxxxxxx
> Slack: @Xiubo Li
>
--
Best Regards,

Xiubo Li (李秀波)

Email: xiubli@xxxxxxxxxx/xiubli@xxxxxxx
Slack: @Xiubo Li

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



