Re: kernel locks due to USB I/O

On 11/11/20 10:51 AM, Alan Stern wrote:
On Tue, Nov 10, 2020 at 06:42:17PM -0500, Alberto Sentieri wrote:
1) The current Ubuntu Kernel is 5.4.0-53. Do you want me to upgrade it to
5.9, from kernel.org? Or is there a Ubuntu 5.9 package that I can use? It
would be easy to do it If there is a Ubuntu package with 5.9, which I would
install and, after the tests, uninstall.
If you want to know what Ubuntu packages are available, you should ask
on an Ubuntu mailing list instead of the linux-usb mailing list.
I am sorry for asking about Ubuntu. For some reason I thought I was exchanging emails with Ubuntu people, but now I understand that you are from kernel.org.

2) Why do you believe that 5.9 would solve the problem? I am asking that
because I cannot change the production machine for a test if I cannot go
back to the original state. There is always a risk involved.
We do not believe that 5.9 will solve the problem -- we have no reason
to believe this -- but we could be wrong.  In any case it is always
best to test with the most up-to-date software available, and 5.9 is the
version closest to what we are working on now.
I will try kernel 5.9 soon...

3) It is one single thread dealing with all 36 devices. Each device has its
own co-routine (not preemptive), but all co-routines are executed by a
unique thread.
If everything runs within a single thread, how can adding a semaphore
or mutex make any difference?
The semaphore blocks a co-routine, not a thread. It is not the type of semaphore C programmers are used to. So, before the introduction of the semaphore, a sequence like this would happen:

request packet device 1 URB submit
request packet device 2 URB submit
...
request packet device 36 URB submit
wait on epoll after submitting 36 URBs, one for each device.
reap URBs, receive response packets, send confirmation packets (basically run the state machine, each device has its own state)

After the semaphore, a sequence like this would happen:

lock the semaphore
send request packet device 1 (URB submit)
wait on epoll
reap URB with device 1 response packet
submit URB with device 1 confirmation
wait on epoll
reap URB submitted on last step
unlock the semaphore
Now go to the next device, which was waiting on the co-routine semaphore.

The main difference is that I would not submit 36 URBs to 36 different devices at the same time. Submitting all 36 URBs at once would make each device start responding as soon as it is ready and receives a poll.
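
The two orderings above can be sketched with Python's asyncio, which also runs every co-routine on one thread. This is only an illustration, not the real program: `exchange`, `Tracker` and `peak_in_flight` are made-up names, and `asyncio.sleep(0)` stands in for returning to the epoll loop while a URB is outstanding. The sketch only counts how many request/response exchanges are in flight at the same time.

```python
import asyncio

NUM_DEVICES = 36  # from the description above: one co-routine per device


class Tracker:
    """Counts how many exchanges are in flight at the same time."""

    def __init__(self):
        self.in_flight = 0
        self.peak = 0

    def enter(self):
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)

    def leave(self):
        self.in_flight -= 1


async def exchange(tracker):
    # Hypothetical stand-in for: submit request URB, reap the response,
    # submit the confirmation URB, reap it. Each await is a point where
    # the real program would go back to waiting on epoll.
    tracker.enter()
    await asyncio.sleep(0)  # response packet outstanding
    await asyncio.sleep(0)  # confirmation URB outstanding
    tracker.leave()


async def device_task(tracker, sem):
    if sem is None:
        await exchange(tracker)      # "before": nothing holds a device back
    else:
        async with sem:              # "after": blocks this co-routine only
            await exchange(tracker)


async def run(serialize):
    tracker = Tracker()
    sem = asyncio.Semaphore(1) if serialize else None
    await asyncio.gather(*(device_task(tracker, sem) for _ in range(NUM_DEVICES)))
    return tracker.peak


def peak_in_flight(serialize):
    """Return the largest number of exchanges that overlapped."""
    return asyncio.run(run(serialize))
```

Calling `peak_in_flight(False)` models the original behaviour, where all devices have an exchange outstanding at once, and `peak_in_flight(True)` models the version with the co-routine semaphore, where only one device is active at a time even though there is still a single thread.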



4) By network console, do you mean ssh? It dies as well when it locks. The
screen is the regular GNOME3 screen and nothing can be seen there. Every
time it locks they send a picture, and I cannot see anything meaningful
there. I am thinking about disabling GNOME3, but I need their blessing for
that.
See https://www.kernel.org/doc/Documentation/networking/netconsole.txt
for instructions on netconsole.  And when you use it for testing, be
sure to set the console log level to a high value.
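
As a rough illustration of the receiving side: netconsole sends kernel messages as UDP datagrams, so a second machine only needs a UDP listener to capture them. The sketch below is an assumption-laden example, not part of the netconsole documentation; the function names, the port 6666 and the log file name are placeholders and must match whatever the sending machine is configured to use.

```python
import socket


def open_listener(port=6666, host="0.0.0.0"):
    """Open a UDP socket for incoming netconsole datagrams.

    The port is an assumption; use the target port configured on the
    machine that is being debugged.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_line(sock, timeout=None):
    """Block until one datagram arrives; return (sender address, text)."""
    sock.settimeout(timeout)
    data, (addr, _port) = sock.recvfrom(65535)
    return addr, data.decode("utf-8", errors="replace").rstrip("\n")


def log_forever(sock, path="netconsole.log"):
    """Append every received message to a file, flushing after each one
    so the last lines before a lockup are not lost in a buffer."""
    with open(path, "a") as log:
        while True:
            addr, line = receive_line(sock)
            log.write(f"{addr} {line}\n")
            log.flush()
```

On the logging machine, `log_forever(open_listener())` would run until interrupted. Flushing after every message matters here because the interesting lines are the ones that arrive just before the other machine stops responding.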

Alan Stern


I will try kernel 5.9. However, it seems that the 5.3 kernel gets lost when too many submits / reaps start happening very close to each other.

Thanks,

Alberto Sentieri



