Re: [PATCH] ceph: do not truncate pagecache if truncate size doesn't change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 11/17/21 9:28 PM, Jeff Layton wrote:
On Wed, 2021-11-17 at 09:21 +0800, Xiubo Li wrote:
On 11/17/21 4:06 AM, Jeff Layton wrote:
On Tue, 2021-11-16 at 17:20 +0800, xiubli@xxxxxxxxxx wrote:
[...]
So when filling the inode it will truncate the pagecache by using
truncate_sizeA again, which makes no sense and will trim the inocent
pages.

Is there a reproducer for this? It would be nice to put something in
xfstests for it if so.
In xfstests' generic/075 has already testing this, but i didn't see any
issue it reproduce.

I just found this strange logs when it's doing
something like:

truncateA 0x10000 --> 0x2000

truncateB 0x2000   --> 0x8000

truncateC 0x8000   --> 0x6000

For the truncateC, the log says:

ceph:  truncate_size 0x2000 -> 0x6000


The problem is that the truncateB will also do the vmtruncate by using
the 0x2000 instead, the vmtruncate will not flush the dirty pages to the
OSD and will just discard them from the pagecaches. Then we may lost
some new updated data in case there has any write before the truncateB
in range [0x2000, 0x8000).


Is that reproducible without the fscrypt size handling changes? I
haven't seen generic/075 fail on stock kernels.

Yeah, there is no error about this, since there has no any write between truncateA and truncateB, if there has I am afraid the test will fail in theory.



If this is a generic bug, then we should go ahead and fix it in
mainline. If it's a problem only with fscrypt enabled, then let's plan
to roll this patch into those changes.

I think it makes sense always to check the truncate_seq and truncate_size here in kclient, or at least should we warn it in case the MDS will do this again in future ?

The truncate_seq and truncate_size should always be changed at the same time.

Make sense ?




Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
   fs/ceph/inode.c | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 1b4ce453d397..b4f784684e64 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -738,10 +738,11 @@ int ceph_fill_file_size(struct inode *inode, int issued,
   			 * don't hold those caps, then we need to check whether
   			 * the file is either opened or mmaped
   			 */
-			if ((issued & (CEPH_CAP_FILE_CACHE|
+			if (ci->i_truncate_size != truncate_size &&
+			    ((issued & (CEPH_CAP_FILE_CACHE|
   				       CEPH_CAP_FILE_BUFFER)) ||
   			    mapping_mapped(inode->i_mapping) ||
-			    __ceph_is_file_opened(ci)) {
+			    __ceph_is_file_opened(ci))) {
   				ci->i_truncate_pending++;
   				queue_trunc = 1;
   			}




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux