Re: [PATCH v2] refscale: Fix use of uninitalized wait_queue_head_t

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 07, 2023 at 01:53:55PM -0400, Waiman Long wrote:
> It was found that running the refscale test might crash the kernel once
> in a while with the following error:
> 
> [ 8569.952896] BUG: unable to handle page fault for address: ffffffffffffffe8
> [ 8569.952900] #PF: supervisor read access in kernel mode
> [ 8569.952902] #PF: error_code(0x0000) - not-present page
> [ 8569.952904] PGD c4b048067 P4D c4b049067 PUD c4b04b067 PMD 0
> [ 8569.952910] Oops: 0000 [#1] PREEMPT_RT SMP NOPTI
> [ 8569.952916] Hardware name: Dell Inc. PowerEdge R750/0WMWCR, BIOS 1.2.4 05/28/2021
> [ 8569.952917] RIP: 0010:prepare_to_wait_event+0x101/0x190
>   :
> [ 8569.952940] Call Trace:
> [ 8569.952941]  <TASK>
> [ 8569.952944]  ref_scale_reader+0x380/0x4a0 [refscale]
> [ 8569.952959]  kthread+0x10e/0x130
> [ 8569.952966]  ret_from_fork+0x1f/0x30
> [ 8569.952973]  </TASK>
> 
> This is likely caused by the fact that init_waitqueue_head() is
> called after the ref_scale_reader kthread is created. The kthread
> can potentially try to use the waitqueue head before it is properly
> initialized. The crash happened at
> 
> 	static inline void __add_wait_queue(...)
> 	{
> 		:
> 		if (!(wq->flags & WQ_FLAG_PRIORITY)) <=== Crash here
> 
> The offset of flags from list_head entry in wait_queue_entry is -0x18. If
> reader_tasks[i].wq.head.next is NULL as allocated reader_task structure
> is zero initialized, the instruction will try to access address
> 0xffffffffffffffe8 which is the fault address listed above.
> 
> Fix this by initializing the waitqueue head first before kthread
> creation.
> 
> Fixes: 653ed64b01dc ("refperf: Add a test to measure performance of read-side synchronization")
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
> Reviewed-by: Davidlohr Bueso <dave@xxxxxxxxxxxx>

Queued and pushed, thank you all!

As always, I could not resist wordsmithing the commit log, please see
below.

							Thanx, Paul

------------------------------------------------------------------------

commit 933d3bf8f96d7cedf78081030e004d23aee2b56c
Author: Waiman Long <longman@xxxxxxxxxx>
Date:   Fri Jul 7 13:53:55 2023 -0400

    refscale: Fix uninitalized use of wait_queue_head_t
    
    Running the refscale test occasionally crashes the kernel with the
    following error:
    
    [ 8569.952896] BUG: unable to handle page fault for address: ffffffffffffffe8
    [ 8569.952900] #PF: supervisor read access in kernel mode
    [ 8569.952902] #PF: error_code(0x0000) - not-present page
    [ 8569.952904] PGD c4b048067 P4D c4b049067 PUD c4b04b067 PMD 0
    [ 8569.952910] Oops: 0000 [#1] PREEMPT_RT SMP NOPTI
    [ 8569.952916] Hardware name: Dell Inc. PowerEdge R750/0WMWCR, BIOS 1.2.4 05/28/2021
    [ 8569.952917] RIP: 0010:prepare_to_wait_event+0x101/0x190
      :
    [ 8569.952940] Call Trace:
    [ 8569.952941]  <TASK>
    [ 8569.952944]  ref_scale_reader+0x380/0x4a0 [refscale]
    [ 8569.952959]  kthread+0x10e/0x130
    [ 8569.952966]  ret_from_fork+0x1f/0x30
    [ 8569.952973]  </TASK>
    
    The likely cause is that init_waitqueue_head() is called after the call to
    the torture_create_kthread() function that creates the ref_scale_reader
    kthread.  Although this init_waitqueue_head() call will very likely
    complete before this kthread is created and starts running, it is
    possible that the calling kthread will be delayed between the calls to
    torture_create_kthread() and init_waitqueue_head().  In this case, the
    new kthread will use the waitqueue head before it is properly initialized,
    which is not good for the kernel's health and well-being.
    
    The above crash happened here:
    
            static inline void __add_wait_queue(...)
            {
                    :
                    if (!(wq->flags & WQ_FLAG_PRIORITY)) <=== Crash here
    
    The offset of flags from list_head entry in wait_queue_entry is
    -0x18. If reader_tasks[i].wq.head.next is NULL as allocated reader_task
    structure is zero initialized, the instruction will try to access address
    0xffffffffffffffe8, which is exactly the fault address listed above.
    
    This commit therefore invokes init_waitqueue_head() before creating
    the kthread.
    
    Fixes: 653ed64b01dc ("refperf: Add a test to measure performance of read-side synchronization")
    Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
    Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
    Reviewed-by: Davidlohr Bueso <dave@xxxxxxxxxxxx>
    Acked-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
    Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
index 1970ce5f22d4..71d138573856 100644
--- a/kernel/rcu/refscale.c
+++ b/kernel/rcu/refscale.c
@@ -1107,12 +1107,11 @@ ref_scale_init(void)
 	VERBOSE_SCALEOUT("Starting %d reader threads", nreaders);
 
 	for (i = 0; i < nreaders; i++) {
+		init_waitqueue_head(&reader_tasks[i].wq);
 		firsterr = torture_create_kthread(ref_scale_reader, (void *)i,
 						  reader_tasks[i].task);
 		if (torture_init_error(firsterr))
 			goto unwind;
-
-		init_waitqueue_head(&(reader_tasks[i].wq));
 	}
 
 	// Main Task



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux