Dear Ubuntu Team, We have analyzed the issue, but due to our limited time and ability to create a fix, we are unable to submit a patch directly. However, we can provide some ideas to assist you in generating a fix that we can then test. I have encountered a race condition issue in the RT2X00 driver, specifically related to the function rt2x00usb_work_rxdone. The issue manifests as a kernel NULL pointer dereference, which causes the system to crash. Below is the detailed analysis and my suggestions for addressing the issue. Problem Analysis The kernel panic log indicates that the crash occurs due to a NULL pointer dereference at the following location: [ 371.258315] BUG: kernel NULL pointer dereference, address: 0000000000000038 [ 371.258339] CPU: 8 PID: 144 Comm: kworker/u40:2 Not tainted 6.8.0-40-generic #40~22.04.2-Ubuntu [ 371.258346] Workqueue: phy23 rt2x00usb_work_rxdone [rt2x00usb] The root cause appears to be a race condition where multiple threads may simultaneously access and modify shared resources without proper synchronization. Specifically, it seems that the pointer being accessed in rt2x00usb_work_rxdone is not consistently initialized before being used, leading to the NULL pointer dereference. Suggestions for Fix Introduce Locking Mechanisms: To prevent concurrent access to shared resources, I recommend introducing locking mechanisms such as spinlock or mutex. This would ensure that only one thread can access the critical section at a time, thereby avoiding race conditions. Pointer Validity Check: Before dereferencing any pointer, it's essential to check whether the pointer is valid (i.e., not NULL). If the pointer is invalid, the function should safely return without proceeding further. Retry and Delay Mechanism: If a critical resource is not yet initialized or is in an unexpected state, implementing a retry mechanism with delays could help avoid crashes. Additionally, more debug information should be logged in case of failure to assist in diagnosing the issue. Code Review: A comprehensive code review focusing on areas where hardware resources and multithreading operations intersect could reveal other potential race conditions. Identifying and addressing these issues proactively would enhance the driver’s robustness. Example Code Snippet While I cannot provide a complete patch, here is an example of how the suggested changes could be implemented: void rt2x00usb_work_rxdone(struct work_struct *work) { struct rt2x00_dev *rt2x00dev = container_of(work, struct rt2x00_dev, rxdone_work); unsigned long flags; void *data; // Lock to protect shared resources spin_lock_irqsave(&rt2x00dev->irq_lock, flags); data = rt2x00usb_get_rx_data(rt2x00dev); if (!data) { // Unlock and return if data is not valid spin_unlock_irqrestore(&rt2x00dev->irq_lock, flags); return; } // Process the data ... // Unlock after processing spin_unlock_irqrestore(&rt2x00dev->irq_lock, flags); } This snippet shows how to introduce a spinlock to protect shared resources and ensure that the pointer is valid before dereferencing it. Conclusion In conclusion, the race condition in the RT2X00 driver is likely caused by insufficient synchronization between threads. By adding proper locking mechanisms, pointer validity checks, and retry mechanisms, this issue can be mitigated. I hope these suggestions will assist in resolving the problem. If you require further assistance or additional information Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> 于2024年8月20日周二 01:43写道: > > On Mon, Aug 19, 2024 at 11:11:10PM +0800, color Ice wrote: > > On some TP-Link routers or routers running OpenWrt, as well as Raspberry Pi > > devices with a headless setup and BeagleBone boards, certain USB > > configurations are required by default. These devices typically grant > > higher permissions to USB by default. Therefore, on certain devices, I can > > run a PoC without using sudo. This explains why there are some inherent > > risk scenarios when declaring this vulnerability, as there are many Linux > > distributions applied to different embedded devices. > > I suggest filing bugs with those distros/system images so that they > properly remove the ability for users to reset any random USB device > this way. If any user can disconnect any driver from any device, that's > not a good system... > > Also, why not dig into the code and try to come up with a fix while > waiting? The code is all there for everyone to read and resolve, that > way you get the proper credit for fixing the issue as well. > > thanks, > > greg k-h