Re: [v3 2/3] mm: Defer TLB flush by keeping both src and dst folios at migration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Nov 8, 2023, at 6:12 AM, Byungchul Park <byungchul@xxxxxx> wrote:
> 
> !! External Email
> 
> On Mon, Oct 30, 2023 at 09:51:30PM +0900, Byungchul Park wrote:
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index 6c264d2f969c..75dc48b6e15f 100644
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -3359,6 +3359,19 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>>>>  if (vmf->page)
>>>>          folio = page_folio(vmf->page);
>>>> 
>>>> + /*
>>>> +  * This folio has its read copy to prevent inconsistency while
>>>> +  * deferring TLB flushes. However, the problem might arise if
>>>> +  * it's going to become writable.
>>>> +  *
>>>> +  * To prevent it, give up the deferring TLB flushes and perform
>>>> +  * TLB flush right away.
>>>> +  */
>>>> + if (folio && migrc_pending_folio(folio)) {
>>>> +         migrc_unpend_folio(folio);
>>>> +         migrc_try_flush_free_folios(NULL);
>>> 
>>> So many potential function calls… Probably they should have been combined
>>> into one and at least migrc_pending_folio() should have been an inline
>>> function in the header.
>> 
>> I will try to change it as you mention.
>> 
>>>> + }
>>>> +
>>> 
>>> What about mprotect? I thought David has changed it so it can set writable
>>> PTEs.
>> 
>> I will check it out.
> 
> I found mprotect stuff is already performing TLB flushes needed for it.
> So some redundant TLB flushes might happen by migrc but it's not that
> harmful I think. Thanks.

Let me explain the scenario I am concerned with. Assume page P is RO, and
moves from Psrc to Pdst. Pointer “p” points to P. Initially (*p == 0).

Let’s also assume we also have an atomic variable “a”. Initially (a == 0).

I hope I got the migration function names right, but I hope the problem
itself can be clear regardless. 

CPU0			CPU1			CPU2		CPU3
----			----			----		----
			(user-mode)		(user-mode)		

			Access *p
			[Psrc cached in TLB]
 
migrate_pages_batch()
-> migrate_folio_unmap()

[ PTE updated, 
  still no flush ]

								mprotect(p,
									RW)

								[ Psrc is
								  RW ]

								[ flush
								  deferred]


						*p = 1  # Pdst
						
						xchg(&a, 1)
			mfence
			if (a == 1)
			  assert(*p == 1);


				
Now at this point the assertion might fail. CPU2 wrote into Pdst, whereas
CPU1 reads from Psrc. But based on x86 memory model, userspace might not
expect this scenario to be possible, hence leading to bugs.










[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux