Re: [PATCH] ovl: check VM_DENYWRITE mappings in copy-up

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 13, 2021 at 5:26 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
>
>  ---- 在 星期四, 2021-04-08 23:03:39 Miklos Szeredi <miklos@xxxxxxxxxx> 撰写 ----
>  > On Thu, Apr 8, 2021 at 1:40 PM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
>  > >
>  > >  ---- 在 星期四, 2021-04-08 19:29:55 Miklos Szeredi <miklos@xxxxxxxxxx> 撰写 ----
>  > >  > On Thu, Apr 8, 2021 at 1:28 PM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
>  > >  > >
>  > >  > >  ---- 在 星期四, 2021-04-08 19:20:42 Chengguang Xu <cgxu519@xxxxxxxxxxxx> 撰写 ----
>  > >  > >  > In overlayfs copy-up, if open flag has O_TRUNC then upper
>  > >  > >  > file will truncate to zero size, in this case we should check
>  > >  > >  > VM_DENYWRITE mappings to keep compatibility with other filesystems.
>  > >  >
>  > >  > Can you provide a test case for the bug that this is fixing?
>  > >  >
>  > >
>  > > Execute binary file(keep running until open) in overlayfs which only has lower && open the binary file with flag O_RDWR|O_TRUNC
>  > >
>  > > Expected result: open fail with -ETXTBSY
>  > >
>  > > Actual result: open success
>  >
>  > Worse,  it's possible to get a "Bus error" with just execute and write
>  > on an overlayfs file, which i_writecount is supposed to protect.
>  >
>  > The reason is that the put_write_access() call in __vma_link_file()
>  > assumes an already negative writecount, but because of the vm_file
>  > shuffle in ovl_mmap() that's not guaranteed.   There's even a comment
>  > about exactly this situation in mmap():
>  >
>  > /* ->mmap() can change vma->vm_file, but must guarantee that
>  > * vma_link() below can deny write-access if VM_DENYWRITE is set
>  > * and map writably if VM_SHARED is set. This usually means the
>  > * new file must not have been exposed to user-space, yet.
>  > */
>  >
>  > The attached patch fixes this, but not your original bug.
>  >
>  > That could be addressed by checking the writecount on *both* lower and
>  > upper for open for write/truncate.  Note: this could be checked before
>  > copy-up, but that's not reliable alone, because the copy up could
>  > happen due to meta-data update, for example, and then the
>  > open/truncate wouldn't trigger the writecount check.
>  >
>  > Something like the second attached patch?
>  >
>
> Yeah, I noticed that too just after posted my previous patch.
> However, rethink these two cases, in practice we share lower layers
> in most use cases especially in container use case. So if we check
> VM_DENYWRITE of lower file strictly, it may cause interferes between
> container instances. Maybe only checking upper file will be better
> option?

Yes.

My patch to fix the SIGBUS is also incomplete as there's still a race
window between releasing the temporary writecount and the __vma_link()
that acquires the final count.    This requires major surgery to fix
properly :(

Thanks,
Miklos




[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux