I just raised one tracker to follow this:
https://tracker.ceph.com/issues/63510
Thanks
- Xiubo
On 11/10/23 22:53, Frank Schilder wrote:
It looks like the cap update request was silently dropped in the MDS.
[...]
If you can reproduce it, then please provide the mds logs by setting:
[...]
I can run a test with MDS logging at a high level. Before I do that, looking at the Python
findings above, is this something that should work on Ceph, or is it a Python issue?
Not sure yet. I need to understand what exactly shutil.copy does in kclient.
Thanks! Will wait for further instructions.
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Xiubo Li <xiubli@xxxxxxxxxx>
Sent: Friday, November 10, 2023 3:14 AM
To: Frank Schilder; Gregory Farnum
Cc: ceph-users@xxxxxxx
Subject: Re: Re: ceph fs (meta) data inconsistent
On 11/10/23 00:18, Frank Schilder wrote:
Hi Xiubo,
I will try to answer the questions from all 3 of your e-mails here, together with some new information we have.
New: The problem occurs in newer Python versions when using the shutil.copy function. There is also a function shutil.copy2, for which the problem does not show up. copy2 behaves a bit like "cp -p", while copy behaves like "cp". The only code difference (on Linux) between these two functions is that copy calls copyfile+copymode while copy2 calls copyfile+copystat. For now we have asked our users to use copy2 to avoid the issue.
The copyfile function calls _fastcopy_sendfile on Linux, which in turn calls os.sendfile, a thin wrapper around the sendfile system call exposed by libc:
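To make the difference between the two calls concrete, here is a minimal sketch built from the public shutil functions named above. The helper names copy_like_shutil_copy and copy_like_shutil_copy2 are hypothetical, and the real shutil.copy/copy2 additionally handle the case where the destination is a directory, so treat this as an approximation and not as the CPython source.

```python
import os
import shutil
import tempfile


def copy_like_shutil_copy(src, dst):
    """Approximate shutil.copy: copy the data, then the permission bits."""
    shutil.copyfile(src, dst)   # file contents only
    shutil.copymode(src, dst)   # chmod dst to match src
    return dst


def copy_like_shutil_copy2(src, dst):
    """Approximate shutil.copy2: copy the data, then mode and timestamps."""
    shutil.copyfile(src, dst)   # file contents only
    shutil.copystat(src, dst)   # mode, atime/mtime, flags where supported
    return dst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        with open(src, "wb") as f:
            f.write(b"x" * 4096)
        # Give the source an old timestamp so the difference is visible.
        os.utime(src, (1_000_000_000, 1_000_000_000))

        a = copy_like_shutil_copy(src, os.path.join(tmp, "a.bin"))
        b = copy_like_shutil_copy2(src, os.path.join(tmp, "b.bin"))
        for path in (src, a, b):
            st = os.stat(path)
            print(os.path.basename(path), st.st_size,
                  oct(st.st_mode & 0o777), int(st.st_mtime))
```

In this sketch the only difference after the data copy is the follow-up metadata call, which matches the observation that the two functions diverge only in copymode versus copystat.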
#include <sys/sendfile.h>
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
I'm wondering whether using this function requires explicit metadata updates or whether it should be safe on CephFS. I'm also not sure whether a user-space client even supports this function (it seems meaningless there). Should this function be safe to use with the Ceph kclient?
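For reference, the sendfile-based data path can be sketched in a few lines of Python. This is a simplified illustration, not the actual _fastcopy_sendfile implementation: the function name sendfile_copy and the chunk size are assumptions, it is Linux-only, and it omits the fallbacks and error handling that shutil has.

```python
import os
import tempfile


def sendfile_copy(src, dst, chunk=8 * 1024 * 1024):
    """Copy src to dst using os.sendfile; return the number of bytes copied."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        in_fd, out_fd = fin.fileno(), fout.fileno()
        offset = 0
        while True:
            # The kernel moves the data from in_fd to out_fd directly,
            # without passing it through a user-space buffer.
            sent = os.sendfile(out_fd, in_fd, offset, chunk)
            if sent == 0:       # end of the input file
                break
            offset += sent
    return offset


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.bin")
        dst = os.path.join(tmp, "out.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(1024 * 1024))
        copied = sendfile_copy(src, dst)
        # Compare the byte count with the size the file system reports.
        print(copied, os.stat(dst).st_size)
```

Note that no explicit metadata call happens in this path; the size of the destination is whatever the file system reports after the writes issued by sendfile.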
I don't foresee any limitation for this in kclient.
shutil.copy copies only the file contents and the permission bits, while
shutil.copy2 also copies the remaining metadata, such as timestamps. I need to
find out what exactly each of them triggers in kclient.
Answers to questions:
BTW, have you tested ceph-fuse with the same test? Is the result the same?
I don't have fuse clients available, so can't test right now.
Have you tried another Ceph version?
We are in the process of deploying a new test cluster; the old one has already been scrapped. I can't test this at the moment.
It looks like the cap update request was silently dropped in the MDS.
[...]
If you can reproduce it, then please provide the mds logs by setting:
[...]
I can run a test with MDS logging at a high level. Before I do that, looking at the Python findings above, is this something that should work on Ceph, or is it a Python issue?
Not sure yet. I need to understand what exactly shutil.copy does in kclient.
Thanks
- Xiubo
Thanks for your help!