Alan,
I am not sure if that would give you any clue, but I enabled usbfs_snoop
in the kernel and I got URB error -108 (endpoint shutdown) just before
the kernel lock.
Just after starting the program, I got something like that:
[ 174.325977] usb 1-1.1.1: opened by process 2961: ate
[ 174.326006] usb 1-1.1.1: usbdev_do_ioctl: GETDRIVER
[ 174.326015] usb 1-1.1.1: usbdev_do_ioctl: IOCTL
[ 174.326334] usb 1-1.1.1: usbdev_do_ioctl: CLAIMINTERFACE
[ 174.326371] usb 1-1.1.1: usbdev_do_ioctl: SETINTERFACE
[ 174.326582] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326592] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326617] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326623] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326629] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326635] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326640] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326645] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326652] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326657] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326663] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326668] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326674] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326679] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326685] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326691] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326696] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326702] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326708] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326713] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326718] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326723] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326729] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326734] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326740] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326745] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326751] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326756] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326762] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326767] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.326773] usb 1-1.1.1: usbdev_do_ioctl: SUBMITURB
[ 174.326783] usb 1-1.1.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.327565] usb 1-1.3.1: opened by process 2961: ate
[ 174.327584] usb 1-1.3.1: usbdev_do_ioctl: GETDRIVER
[ 174.327593] usb 1-1.3.1: usbdev_do_ioctl: IOCTL
[ 174.327855] usb 1-1.3.1: usbdev_do_ioctl: CLAIMINTERFACE
[ 174.327881] usb 1-1.3.1: usbdev_do_ioctl: SETINTERFACE
[ 174.328180] usb 1-1.3.1: usbdev_do_ioctl: SUBMITURB
[ 174.328188] usb 1-1.3.1: userurb 0000000000000000, ep1 int-in, length 64
[ 174.328205] usb 1-1.3.1: usbdev_do_ioctl: SUBMITURB
During the program execution, I got many lines like that:
[ 178.537958] usb 1-1.4.1: urb complete
[ 178.537975] usb 1-1.4.1: userurb pK-error, ep1 int-out,
actual_length 64 status 0
[ 178.538000] usb 1-1.2.1: urb complete
[ 178.538006] usb 1-1.2.1: userurb pK-error, ep1 int-out,
actual_length 64 status 0
I am not sure if pK-error has a special meaning there.
However, just before the lock, I got this sequence:
[ 387.326795] usb 1-1.5.3.1: usbdev_do_ioctl: SUBMITURB
[ 387.326801] usb 1-1.5.3.1: userurb 0000000000000000, ep1 int-in,
length 64
[ 387.327531] usb 1-1.5.3.1: urb complete
[ 387.327539] usb 1-1.5.3.1: userurb pK-error, ep1 int-out,
actual_length 64 status 0
[ 387.327582] usb 1-1.5.3.1: usbdev_do_ioctl: REAPURBNDELAY
[ 387.327590] usb 1-1.5.3.1: reap 0000000000000000
[ 387.327598] usb 1-1.5.3.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.109400] usb 3-1.5.7.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.110032] usb 3-1.6.7.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.110169] usb 3-1.5.5.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.110605] usb 3-1.6.5.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.111394] usb 3-1.1.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.403278] usb 3-1.6.4.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.404011] usb 3-1.5.1.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.404344] usb 3-1.6.3.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.404649] usb 3-1.4.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.404790] usb 3-1.3.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.405283] usb 3-1.5.3.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.407828] usb 3-1.6.1.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.408771] usb 3-1.5.4.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.409155] usb 3-1.5.2.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.409452] usb 3-1.5.6.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.409743] usb 3-1.6.6.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.409848] usb 3-1.2.1: usbdev_do_ioctl: REAPURBNDELAY
[ 388.410840] usb 3-1.6.2.1: usbdev_do_ioctl: REAPURBNDELAY
[ 389.612597] usb 3-1.5.7.1: opened by process 2961: ate
[ 389.612622] usb 3-1.5.7.1: usbdev_do_ioctl: GETDRIVER
[ 389.612631] usb 3-1.5.7.1: usbdev_do_ioctl: IOCTL
[ 389.614965] usb 3-1.5.7.1: urb complete
[ 389.614981] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.614989] usb 3-1.5.7.1: urb complete
[ 389.614994] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.614999] usb 3-1.5.7.1: urb complete
[ 389.615003] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615008] usb 3-1.5.7.1: urb complete
[ 389.615013] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615018] usb 3-1.5.7.1: urb complete
[ 389.615022] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615027] usb 3-1.5.7.1: urb complete
[ 389.615032] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615037] usb 3-1.5.7.1: urb complete
[ 389.615042] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615047] usb 3-1.5.7.1: urb complete
[ 389.615051] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615056] usb 3-1.5.7.1: urb complete
[ 389.615060] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615065] usb 3-1.5.7.1: urb complete
[ 389.615070] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615074] usb 3-1.5.7.1: urb complete
[ 389.615079] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615084] usb 3-1.5.7.1: urb complete
[ 389.615089] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615094] usb 3-1.5.7.1: urb complete
[ 389.615098] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615103] usb 3-1.5.7.1: urb complete
[ 389.615108] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615112] usb 3-1.5.7.1: urb complete
[ 389.615117] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.615121] usb 3-1.5.7.1: urb complete
[ 389.615126] usb 3-1.5.7.1: userurb pK-error, ep1 int-in,
actual_length 0 status -108
[ 389.616823] usb 3-1.5.7.1: urb complete
[ 389.616839] usb 3-1.5.7.1: userurb pK-error, ep1 int-out,
actual_length 0 status -108
[ 389.616926] usb 3-1.5.7.1: usbdev_do_ioctl: CLAIMINTERFACE
[ 389.616964] usb 3-1.5.7.1: usbdev_do_ioctl: SETINTERFACE
As you can see, it locked in 215 seconds, in this test. Do you see
anything special in this sequence?
Thanks,
Alberto Sentieri
On 11/16/20 12:06 PM, Alan Stern wrote:
On Mon, Nov 16, 2020 at 11:53:38AM -0500, Alberto Sentieri wrote:
The objective of this email is to report the current status of my findings.
I loaded netconsole on both machines I was having problems with. I tried 3
times on the machine with kernel 5.0.0-37 and twice with on the machine with
kernel 5.3.0-62. Each attempt consisted of running the program which lock
the kernel until it locked (about 3 minutes after stating the program). The
referred program had the "semphore code" commented out. Nothing was sent to
netconsole on all the 5 attempts I made when the kernel locked.
Just to be clear about my use of netconsole, before loading the netconsole
kernel module, I ran "dmesg -n 8". When netconsole module was loaded I could
clearly see about 9 message lines on the computer receiving the netconsole
messages telling me that netconsole was loaded (and how it was configured),
so no doubts about the correct netconsole setup. The "netconsole server" was
a machine on the same local network.
My next attempt will be to compile kernel 5.9, as you suggest, and try it.
While this may not generate any useful information, one way to create a
bunch of log output while running your test is to set the usbfs_snoop
module parameter for usbcore to true.
You can also enable dynamic debugging for usbcore, although in a stable
environment like yours it probably won't produce much output.
Alan Stern