Re: Rogue EXDEV errors when hardlinking

Hi Domhnall,

----- On 20 Mar 25, at 17:45, Domhnall McGuigan dmcguigan@xxxxxx wrote:

> Hi all, we've been seeing persistent problems when trying to create hardlinks on
> cephfs; it's returning EXDEV in a way that makes no sense given typical POSIX
> behaviour and ceph documentation. Here's a typical strace of the problem:
> 
>        78    13:47:26.572435 link("/data/db/hdb/data/2023.08.06/table1.0/column1",
>        "/data/db/hdb/data/2023.08.06/table1.1/column1") = -1 EXDEV (Invalid
>        cross-device link)
>        78    13:47:26.577661 write(1,
>        "{\"time\":\"2025-03-03T13:47:26.577z\",\"component\":\"MSVC\",\"level\":\"INFO\",\"message\":\"[eoi-78]
>        Retrying in 500 milliseconds\",\"service\":\"eoi\"}\n", 136) = 136
>        78    13:47:26.577762 clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0,
>        tv_nsec=500000000}, NULL) = 0
>        78    13:47:27.078037 link("/data/db/hdb/data/2023.08.06/table1.0/column1",
>        "/data/db/hdb/data/2023.08.06/table1.1/column1") = 0
> 
> We try creating a link, get EXDEV, wait 500 milliseconds, then try the same
> operation again and it succeeds. The link and its target are both on the same
> cephfs mount (/data/db/hdb in this case), so the normal POSIX 'linking between
> filesystems' explanation doesn't apply.  I've looked through the ceph client
> and server code and from what I've seen EXDEV is only returned in a couple of
> other situations: linking between snapshots, and linking across quotas. Neither
> snapshots nor quotas were in use here, and if they were the culprit it seems
> unlikely the automatic retry would have worked. Web searches on EXDEV errors in
> ceph have also proven to be a dead end. My best guess, although it's not a very
> good one, is that stale MDS cache data is somehow involved -- in one case the
> issue reportedly got much worse after increasing (!) the MDS memory limit.
> 
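To make the retry pattern in the strace above concrete, here is a minimal sketch of a link call that retries only on EXDEV and records whether both parent directories report the same device ID. It is not the application's actual code: the names `link_with_retry` and `same_device` and the retry count are assumptions for illustration, and only the 500 ms delay is taken from the trace.

```python
import errno
import os
import sys
import time


def same_device(src, dst):
    """Return True/False if both parent directories report the same st_dev,
    or None if either directory cannot be stat'ed."""
    try:
        src_dev = os.stat(os.path.dirname(src) or ".").st_dev
        dst_dev = os.stat(os.path.dirname(dst) or ".").st_dev
    except OSError:
        return None
    return src_dev == dst_dev


def link_with_retry(src, dst, retries=3, delay=0.5,
                    link=os.link, sleep=time.sleep):
    """Hard-link src to dst, retrying only when the error is EXDEV.

    Returns the number of EXDEV failures seen before the link succeeded.
    Any other error, or EXDEV after `retries` retries, is re-raised.
    `link` and `sleep` are injectable so the logic can be tested without
    a filesystem that actually misbehaves.
    """
    failures = 0
    while True:
        try:
            link(src, dst)
            return failures
        except OSError as exc:
            if exc.errno != errno.EXDEV or failures >= retries:
                raise
            failures += 1
            # Log context that helps rule out a genuine cross-device link.
            print(
                f"EXDEV on link({src!r}, {dst!r}), attempt {failures}; "
                f"same st_dev for parents: {same_device(src, dst)}",
                file=sys.stderr,
            )
            sleep(delay)
```

If the logged `st_dev` values match on every failure, that would support the claim that the error is not an ordinary cross-filesystem link.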
> This error has been occurring for a particular client for upwards of 9 months
> and has proven stubbornly resistant to reproduction elsewhere (we are working
> on migrating them to a more recent ceph version to see if the error remains),

You mean the kernel version, right? You're not using ceph-fuse to mount the filesystem, are you?

A quick search for 'cephfs EXDEV' points to ceph-fuse and/or quotas [1][2]. Are you using either of these?
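
Both questions can be checked from the client side. The following is a rough sketch, not a definitive tool. It assumes that a kernel mount appears in /proc/mounts with filesystem type `ceph` and a ceph-fuse mount with a type starting with `fuse.ceph`, and that quotas are exposed through the `ceph.quota.max_bytes` and `ceph.quota.max_files` extended attributes. The function names are made up for illustration, and mount points containing escaped characters (such as `\040` for spaces) are not handled.

```python
import os
import sys


def find_mount(path, mounts_file="/proc/mounts"):
    """Return (mount_point, fs_type) for the mount containing `path`,
    or None if nothing matches."""
    real = os.path.realpath(path)
    best = None
    with open(mounts_file) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 3:
                continue
            mnt, fstype = fields[1], fields[2]
            # Match on whole path components, so /data does not match /database.
            if real == mnt or real.startswith(mnt.rstrip("/") + "/"):
                # Prefer the longest mount point; on ties the later entry wins.
                if best is None or len(mnt) >= len(best[0]):
                    best = (mnt, fstype)
    return best


def client_kind(fstype):
    """Classify a filesystem type string (assumed naming, see lead-in)."""
    if fstype == "ceph":
        return "kernel"
    if fstype.startswith("fuse.ceph"):
        return "ceph-fuse"
    return "not-cephfs"


def quota_dirs(path, stop_at):
    """Walk from `path` up to `stop_at` and collect directories that carry a
    non-zero CephFS quota attribute, as (directory, attribute, value)."""
    found = []
    cur = os.path.realpath(path)
    stop = os.path.realpath(stop_at)
    while True:
        for name in ("ceph.quota.max_bytes", "ceph.quota.max_files"):
            try:
                val = os.getxattr(cur, name).strip(b"\0")
            except OSError:
                continue  # attribute absent or not supported here
            if val not in (b"", b"0"):
                found.append((cur, name, val.decode()))
        parent = os.path.dirname(cur)
        if cur == stop or parent == cur:
            break
        cur = parent
    return found


if __name__ == "__main__":
    for p in sys.argv[1:]:
        mount = find_mount(p)
        if mount is None:
            print(p, "-> no mount found")
            continue
        print(p, "->", mount, client_kind(mount[1]))
        for entry in quota_dirs(p, mount[0]):
            print("  quota:", entry)
```

Running it against the source and destination directories of a failing link() would show which client is in use and whether the two paths sit under different quota directories.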

Frédéric.

[1] https://ceph-users.ceph.narkive.com/XW20WeeF/cephfs-move-operation
[2] https://www.spinics.net/lists/ceph-users/msg67823.html

> so our technical investigations haven't got particularly far. I was hoping
> someone here on ceph-users would have seen similar EXDEV errors in the wild or
> in development and have some insight into what could be causing them.
> 
> Regards, Domhnall
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx