Re: [PATCH 2/2] ceph: fix coherency issue when truncating file size for fscrypt

On 4/8/22 4:32 AM, Jeff Layton wrote:
On Fri, 2022-04-08 at 03:14 +0800, Xiubo Li wrote:
On 4/7/22 11:38 PM, Jeff Layton wrote:
On Thu, 2022-04-07 at 11:33 -0400, Jeff Layton wrote:
On Thu, 2022-04-07 at 22:41 +0800, xiubli@xxxxxxxxxx wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

When truncating the file size the MDS will help update the last
encrypted block, and during this we need to make sure the client
won't fill the pagecaches.

Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
   fs/ceph/inode.c | 7 ++++++-
   1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index f4059d73edd5..cc1829ab497d 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2647,9 +2647,12 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
   		req->r_num_caps = 1;
   		req->r_stamp = attr->ia_ctime;
   		if (fill_fscrypt) {
+			filemap_invalidate_lock(inode->i_mapping);
   			err = fill_fscrypt_truncate(inode, req, attr);
-			if (err)
+			if (err) {
+				filemap_invalidate_unlock(inode->i_mapping);
   				goto out;
+			}
   		}
   		/*
@@ -2660,6 +2663,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
   		 * it.
   		 */
   		err = ceph_mdsc_do_request(mdsc, NULL, req);
+		if (fill_fscrypt)
+			filemap_invalidate_unlock(inode->i_mapping);
   		if (err == -EAGAIN && truncate_retry--) {
   			dout("setattr %p result=%d (%s locally, %d remote), retry it!\n",
   			     inode, err, ceph_cap_string(dirtied), mask);
Looks reasonable. Is there any reason we shouldn't do this in the non-
encrypted case too? I suppose it doesn't make as much difference in that
case.
We only need this in the encrypted case, which does the RMW for the last
block.
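
To make "the RMW for the last block" concrete, here is a minimal userspace sketch (not the ceph code) of when a truncate lands inside an encrypted block and therefore needs a read-modify-write. The 4096-byte block size, the struct, and the function name are assumptions made up for illustration, as is the rule that only shrinking truncates qualify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encryption block size; hypothetical constant for this sketch. */
#define SKETCH_CRYPT_BLOCK_SIZE 4096ULL

/* Hypothetical description of the RMW a truncate would require. */
struct rmw_plan {
	bool needed;          /* true if the new EOF falls inside a block */
	uint64_t block_start; /* file offset of the block holding the new EOF */
	uint64_t keep_len;    /* bytes of that block that remain valid */
};

static struct rmw_plan plan_truncate_rmw(uint64_t old_size, uint64_t new_size)
{
	struct rmw_plan plan = { false, 0, 0 };
	uint64_t keep = new_size % SKETCH_CRYPT_BLOCK_SIZE;

	/*
	 * Assumption: only a shrinking truncate that is not block aligned
	 * leaves a partial block that must be decrypted, zero-filled past
	 * the new EOF, re-encrypted and written back.
	 */
	if (new_size >= old_size || keep == 0)
		return plan;

	plan.needed = true;
	plan.block_start = new_size - keep;
	plan.keep_len = keep;
	return plan;
}
```

For example, shrinking a 10000-byte file to 5000 bytes would touch the block starting at offset 4096 and keep its first 904 bytes, while shrinking to 8192 bytes is block aligned and needs no RMW in this model.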


I'll plan to pull this and the other patch into the wip-fscrypt branch.
Should I just fold them into your earlier patches?
Yeah, certainly.
OTOH...do we really need this? I'm not sure I understand the race you're
trying to prevent. Can you lay it out for me?
My thinking is that a page fault could still happen during the RMW for
the last block, because the page fault handler doesn't prevent it.

And we should prevent that while the RMW is going on.

Right, but the RMW is being done using an anonymous page, and at this
point in the process we haven't really touched the pagecache yet. That
doesn't happen until __ceph_do_pending_vmtruncate.

Most of the callers for filemap_invalidate_lock/_unlock are in the hole
punching codepaths, and not so much in truncate. What outcome are you
trying to prevent with this? Can you lay out the potential race and why
it would be harmful?

Yeah, here I forgot to invalidate the mapping. After writing the dirty pagecache back, we should invalidate the mapping and drop the related pages too.

It should be:

filemap_invalidate_lock(inode->i_mapping);

write pagecache back;

invalidate the mapping and drop the pages;

do the RMW;

filemap_invalidate_unlock(inode->i_mapping);


As you mentioned in another mail, other processes could do the map read at the same time. While we are truncating the size, shouldn't we block the map read from continuing, so that it just triggers a page fault, and make that page fault wait until our truncate finishes?

-- Xiubo



