Re: Hole punch races in Ceph

Jeff Layton <jlayton@xxxxxxxxxx> · Thu, 22 Apr 2021 07:43:16 -0400

On Thu, 2021-04-22 at 13:15 +0200, Jan Kara wrote:
> Hello,
> 
> I'm looking into how Ceph protects against races between page fault and
> hole punching (I'm unifying protection for this kind of race among
> filesystems) and AFAICT it does not. What I have in mind in particular is a
> race like:
> 
> CPU1					CPU2
> 
> ceph_fallocate()
>   ...
>   ceph_zero_pagecache_range()
> 					ceph_filemap_fault()
> 					  faults in page in the range being
> 					  punched
>   ceph_zero_objects()
> 
> And now we have a page in punched range with invalid data. If
> ceph_page_mkwrite() manages to squeeze in at the right moment, we might
> even associate invalid metadata with the page I'd assume (but I'm not sure
> whether this would be harmful). Am I missing something?
> 
> 								Honza

No, I don't think you're missing anything. If ceph_page_mkwrite() happens
to get called at an inopportune time then we'd probably end up writing
that page back into the punched range too. What would be the best way to
fix this, do you think?

One idea:

We could lock the pages we're planning to punch out first, then
zero/punch out the objects on the OSDs, and only then do the hole punch
in the pagecache. Would that be sufficient to close the race?
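
To make the ordering concrete, here is a toy userspace model of both
sequences. It is not ceph code: struct model, model_fault(),
punch_current() and punch_locked() are all made-up names, and each
"page" is just an int. The model also lets us lock an index slot whether
or not a page is cached there, which the real pagecache may not give us,
so it only illustrates the ordering and doesn't show the race is closed.

```c
/* Userspace model only: none of this is ceph code, all names are made up. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NPAGES 4

struct model {
	int  osd[NPAGES];	/* backing objects, one int per "page" */
	int  cache[NPAGES];	/* pagecache contents */
	bool cached[NPAGES];	/* is a page present in the pagecache? */
	bool locked[NPAGES];	/* slot lock held by the hole punch (assumption) */
};

static void model_init(struct model *m)
{
	memset(m, 0, sizeof(*m));
	for (int i = 0; i < NPAGES; i++)
		m->osd[i] = 100 + i;	/* nonzero "file data" */
}

/* Fault path: returns false if it would have to wait on a locked slot. */
static bool model_fault(struct model *m, int idx)
{
	assert(idx >= 0 && idx < NPAGES);
	if (m->locked[idx])
		return false;
	if (!m->cached[idx]) {
		m->cache[idx] = m->osd[idx];	/* read in from the object */
		m->cached[idx] = true;
	}
	return true;
}

/* Stand-in for the pagecache half of the punch, range is [start, end). */
static void drop_cache(struct model *m, int start, int end)
{
	for (int i = start; i < end; i++)
		m->cached[i] = false;
}

/* Stand-in for zeroing the objects on the OSDs, range is [start, end). */
static void zero_objects(struct model *m, int start, int end)
{
	for (int i = start; i < end; i++)
		m->osd[i] = 0;
}

/* Current ordering: pagecache first, then objects; a fault lands in between. */
static void punch_current(struct model *m, int start, int end, int racer)
{
	drop_cache(m, start, end);
	model_fault(m, racer);		/* CPU2 sneaks in here */
	zero_objects(m, start, end);
}

/* Proposed ordering: lock, zero objects, punch pagecache, unlock. */
static bool punch_locked(struct model *m, int start, int end, int racer)
{
	bool fault_ran;

	for (int i = start; i < end; i++)
		m->locked[i] = true;
	zero_objects(m, start, end);
	fault_ran = model_fault(m, racer);	/* CPU2 has to wait */
	drop_cache(m, start, end);
	for (int i = start; i < end; i++)
		m->locked[i] = false;
	return fault_ran;
}
```

With a fault racing on index 2 of a punch over [1, 3), punch_current()
leaves the old data cached over a zeroed object, while punch_locked()
makes the fault wait, and a later retry reads zeroes.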

-- 
Jeff Layton <jlayton@xxxxxxxxxx>