DAX: __dax_fault race question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I try to understand locking rules for dax and realized that there is
some suspicious case in dax_fault

On __dax_fault we try to replace normal page with dax-entry
Basically dax_fault steps looks like follows

1) page = find_get_page(..)
2) lock_page_or_retry(page)
3) get_block
4) delete_from_page_cache(page)
5) unlock_page(page)
6) dax_insert_mapping(inode, &bh, vma, vmf)
...

But what protects us from other taks does new page_fault after (4) but
before (6).
AFAIU this case is not prohibited
Let's see what happens for two read/write tasks does fault inside file-hole
task_1(writer)                  task_2(reader)
__dax_fault(write)
  ->lock_page_or_retry
  ->delete_from_page_cache()    __dax_fault(read)
                                ->dax_load_hole
                                  ->find_or_create_page()
                                    ->new page in mapping->radix_tree               
  ->dax_insert_mapping
     ->dax_radix_entry->collision: return -EIO

Before dax/fsync patch-set this race result in silent dax/page duality(which
likely result data incoherence or data corruption), Luckily now this
race result in collision on insertion to radix_tree and return -EIO.
From first glance testcase looks very simple, but I can not reproduce
this in my environment. 

Imho it is reasonable pass locked page to dax_insert_mapping and let
dax_radix_entry use atomic page/dax-entry replacement similar to
replace_page_cache_page. Am I right?

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]