Hi, On Wed, Sep 08, 2021 at 01:17:02AM +0200, David Runge wrote: > Upon first boot the /dev/fw0 and /dev/fw1 nodes did not show up > initially [1] (log without using -1 debug parameter), which is why I > switched the device off and on again. > Afterwards the device nodes were created, so I attempted to use aplay -l > to check whether the device was detected. The command hung forever (also > starting jackd) and I had to reboot eventually. Using alsamixer I was > able to select the Fireface800 device from the list of devices though. > > Upon 2nd boot I used the -1 debug parameter for firewire-ohci [2]. > The device nodes were created automatically, but neither aplay -l nor > alsamixer showed the audio interface. The attempt to switch the device > off and on again and afterwards to remove the firewire-ohci module lead > to another freeze and a successive reboot. > > If I can provide further information, please let me know! > > I hope I can try the vanilla 5.14.1 kernel tomorrow to see how it > behaves in regards to firewire-ohci (judging from the past it will > probably work though). > > Best, > David > > P.S.: I have added Kristian Hoegsberg in CC, who is mentioned as the > author of firewire-ohci and firewire-core. > > [1] https://pkgbuild.com/~dvzrv/bugs/2021/09/linux-5.14.1.19.realtime1-kernel.log > [2] https://pkgbuild.com/~dvzrv/bugs/2021/09/linux-5.14.1.19.realtime1-restart_device_kernel.log ``` kernel: INFO: task pipewire-media-:2554 blocked for more than 122 seconds. kernel: Not tainted 5.14.1.19.realtime1-1-rt #1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: task:pipewire-media- state:D stack: 0 pid: 2554 ppid: 1803 flags:0x00000220 kernel: Call Trace: kernel: __schedule+0x338/0x1580 kernel: ? __mod_zone_page_state+0x81/0xf0 kernel: ? rt_spin_unlock+0x13/0x40 kernel: ? rmqueue_bulk+0x201/0x820 kernel: schedule+0x43/0xf0 kernel: schedule_timeout+0x14d/0x190 kernel: wait_for_completion+0x9e/0x100 kernel: fw_run_transaction+0xd7/0x110 [firewire_core bb04f32445782056ab0dc73199b57a87dc27c06d] ... ``` According to the log, the task of 'pipewire-media-:2554' is blocked during 122 seconds by call of 'wait_for_completion()' in code of 'fw_run_transaction()'. This is odd in two points of transaction service programmed in Linux FireWire subsystem: 1. The process context should be awakened by softIRQ context, which should be scheduled by hwIRQ context for hardware interrupt of OHCI 1394 controller. 2. Even if the softIRQ context is not invoked, the process context should be awakened by wheel timer context, which is scheduled to finish the transaction several jiffies later (originally prepared for the case of split-transaction). In the case, the result of transaction is 'RCODE_CANCELLED'. The call graph is below: 1 in the process context: ``` (drivers/firewire/core-transaction.c) ->fw_run_transaction() ->timer_setup_on_stack() ->fw_send_request() ->timer_setup() ->struct fw_card_driver.send_request() (drivers/firewire/ohci.c) = ohci_send_request() ... ->writel() ->wait_for_completion() (blocked and awakened by softIRQ context later) ->destroy_timer_on_stack() ``` 2 in the hwIRQ context: ``` (drivers/firewire/ohci.c) ->irq_handler() ->tasklet_schedule(&ohci->at_response_ctx.tasklet); ``` 3 in the softIRQ context: ``` (drivers/firewire/core-transaction.c) ->handle_at_packet() ->struct fw_packet.callback() = transmit_complete_callback() ->close_transaction() ->struct fw_transaction.callback() = transaction_callback() ->complete() ``` 4. in the timer wheel context: ``` (drivers/firewire/core-transaction.c) ->split_transaction_timeout_callback() ->struct fw_packet.callback() = transaction_callback() ->complete() ``` I guess we have issue in the softIRQ context or timer wheel context instead of hwIRQ context, if the OHCI 1394 controller you use has no quirk (I suggest you to give information about the controller). I'd like you to check it. The firewire-ohci module has 'debug' parameter. `` $ modinfo firewire-ohci ... parm: debug:Verbose logging (default = 0, AT/AR events = 1, self-IDs = 2, IRQs = 4, busReset events = 8, or a combination, or all = -1) (int) `` When the parameter has value '5' (= 1 | 4), we can see information per transaction, like: ``` kernel: firewire_ohci 0000:0b:00.0: IRQ 00000001 AT_req kernel: firewire_ohci 0000:0b:00.0: AT spd 2 tl 27, ffc1 -> ffc0, ack_pending , QR req, fffff0000400 kernel: firewire_ohci 0000:0b:00.0: IRQ 00000020 AR_resp kernel: firewire_ohci 0000:0b:00.0: AR spd 2 tl 27, ffc0 -> ffc1, ack_complete, QR resp = 04101573 ``` The first line: in hwIRQ context for request subaction The second line: in softIRQ context scheduled by the above. The third line: in hwIRQ context for response subaction The fourth line: in softIRQ context scheduled by the above. If we have any issue in invocation of softIRQ, we would often see missing swIRQ logs (and it might be the most concern to RT Linux developers). If you need instruction to configure module parameters, 'man modprobe.d' may be a good help. Thanks Takashi Sakamoto