Hi Charles,

as far as I know, CephFS implements POSIX semantics. That is, if the CephFS
server cluster dies for whatever reason, this will translate into I/O errors.
This is the same as if your NFS server dies, or if you run the program locally
on a workstation/laptop and the machine loses power.

POSIX file systems guarantee that data is persisted to storage after a file is
closed or fsync() is called. Otherwise, the data may still be "in flight",
e.g. in the OS I/O cache or even in the runtime library's cache. This is not a
bug but a feature, as it improves performance: when appending small bits to a
file, the HDD head does not have to move for every write, and on an SSD a full
4 kB block does not have to be rewritten each time. POSIX semantics go even
further, enforcing certain guarantees when files are written from multiple
clients. Recently, something called "lazy I/O" was introduced in CephFS [1],
which allows applications to explicitly relax some of these guarantees to
improve performance.

I don't think there is even a CephFS mount option that lets you configure
local caching behaviour the way you can with NFS. With NFS, I have seen setups
where two clients saw two different versions of the same -- closed -- file,
because one client had written to the file and the change was not yet visible
on the other. To the best of my knowledge, this will not happen with CephFS.
I'd be happy to be corrected if I'm wrong. ;-)

(I've appended a small sketch at the bottom of this mail showing how an
application can check for these errors.)

Best wishes,
Manuel

[1] https://docs.ceph.com/en/latest/cephfs/lazyio/

On Thu, Dec 8, 2022 at 5:09 PM Charles Hedrick <hedrick@xxxxxxxxxxx> wrote:
> thanks. I'm evaluating cephfs for a computer science dept. We have users
> that run week-long AI training jobs. They use standard packages, which they
> probably don't want to modify. At the moment we use NFS. It uses
> synchronous I/O, so if something goes wrong, the users' jobs pause until we
> reboot, and then continue. However, there's an obvious performance penalty
> for this.
> ________________________________
> From: Gregory Farnum <gfarnum@xxxxxxxxxx>
> Sent: Thursday, December 8, 2022 2:08 AM
> To: Dhairya Parmar <dparmar@xxxxxxxxxx>
> Cc: Charles Hedrick <hedrick@xxxxxxxxxxx>; ceph-users@xxxxxxx
> Subject: Re: Re: what happens if a server crashes with cephfs?
>
> More generally, as Manuel noted, you can (and should!) make use of fsync et
> al. for data safety. Ceph's async operations are no different at the
> application layer from how data you send to a hard drive can sit around in
> volatile caches until a consistency point like fsync is invoked.
> -Greg
>
> On Wed, Dec 7, 2022 at 10:02 PM Dhairya Parmar <dparmar@xxxxxxxxxx> wrote:
> Hi Charles,
>
> There are many scenarios where a write/close operation can fail, but
> failures/errors are generally logged (normally every time) to help debug
> the case. So there are no silent failures as such, unless you hit a very
> rare bug.
> - Dhairya
>
> On Wed, Dec 7, 2022 at 11:38 PM Charles Hedrick <hedrick@xxxxxxxxxxx> wrote:
> > I believe asynchronous operations are used for some operations in cephfs.
> > That means the server acknowledges before data has been written to stable
> > storage. Does that mean there are failure scenarios where a write or
> > close will return an error? Or fail silently?
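P.S. To make the fsync()/close() point concrete, here is a minimal sketch of
how an application can surface these errors instead of losing data silently.
It is plain POSIX C, nothing CephFS-specific, and the path and payload are
made up for illustration:

/* Minimal sketch: write a buffer durably and report any I/O error.
 * Plain POSIX; nothing here is CephFS-specific. The path and payload
 * are hypothetical. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/mnt/cephfs/checkpoint.tmp";  /* hypothetical path */
    const char buf[] = "model checkpoint data\n";

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* write() may return a short count; loop until everything is queued. */
    size_t off = 0;
    while (off < sizeof(buf) - 1) {
        ssize_t n = write(fd, buf + off, sizeof(buf) - 1 - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            perror("write");  /* e.g. EIO if the cluster became unreachable */
            close(fd);
            return EXIT_FAILURE;
        }
        off += (size_t)n;
    }

    /* fsync() is the consistency point: only after it succeeds is the data
     * guaranteed to be on stable storage; deferred write-back errors are
     * reported here. */
    if (fsync(fd) < 0) {
        perror("fsync");
        close(fd);
        return EXIT_FAILURE;
    }

    /* close() can also report errors; don't ignore its return value. */
    if (close(fd) < 0) {
        perror("close");
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

Higher-level runtimes expose the same calls (e.g. os.fsync() in Python, and
file close raises an exception on error there), so jobs built on standard
packages get the same behaviour as long as those errors are not swallowed.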