Re: [PATCH 12/12] dax: New fault locking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 11 2016, NeilBrown wrote:

> On Fri, Mar 11 2016, Jan Kara wrote:
>
>> Currently DAX page fault locking is racy.
>>
>> CPU0 (write fault)		CPU1 (read fault)
>>
>> __dax_fault()			__dax_fault()
>>   get_block(inode, block, &bh, 0) -> not mapped
>> 				  get_block(inode, block, &bh, 0)
>> 				    -> not mapped
>>   if (!buffer_mapped(&bh))
>>     if (vmf->flags & FAULT_FLAG_WRITE)
>>       get_block(inode, block, &bh, 1) -> allocates blocks
>>   if (page) -> no
>> 				  if (!buffer_mapped(&bh))
>> 				    if (vmf->flags & FAULT_FLAG_WRITE) {
>> 				    } else {
>> 				      dax_load_hole();
>> 				    }
>>   dax_insert_mapping()
>>
>> And we are in a situation where we fail in dax_radix_entry() with -EIO.
>>
>> Another problem with the current DAX page fault locking is that there is
>> no race-free way to clear dirty tag in the radix tree. We can always
>> end up with clean radix tree and dirty data in CPU cache.
>>
>> We fix the first problem by introducing locking of exceptional radix
>> tree entries in DAX mappings acting very similarly to page lock and thus
>> synchronizing properly faults against the same mapping index. The same
>> lock can later be used to avoid races when clearing radix tree dirty
>> tag.
>
> Hi,
>  I think the exception locking bits look good - I cannot comment on the
>  rest.
>  I looks like it was a good idea to bring the locking into dax.c instead
>  of trying to make it generic.
>

Actually ... I'm still bothered by the exclusive waiting.  If an entry
is locked and there are two threads in dax_pfn_mkwrite() then one would
be woken up when the entry is unlocked and it will just set the TAG_DIRTY
flag and then continue without ever waking the next waiter on the
wait queue.

I *think* that any thread which gets an exclusive wakeup is responsible
for performing another wakeup.  In this case it must either lock the
slot, or call __wakeup.
That means:
  grab_mapping_entry needs to call wakeup:
     if radix_tree_preload() fails
     if radix_tree_insert fails other than with -EEXIST
     if a valid page was found
  dax_delete_mapping_entry needs to call wakeup
     if the fail case, though as that isn't expect (WARN_ON_ONCE)
        it should be a problem not to wakeup here
  dax_pfn_mkwrite needs to call wakeup unconditionally


Am I missing something?

Thanks,
NeilBrown

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux