Re: firewire-ohci fails to initialize Texas Instruments XIO2213A/B/XIO2221 based controller on realtime kernels [5.4.91-rt50, 5.10.8-rt24]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Wed, Sep 08, 2021 at 01:17:02AM +0200, David Runge wrote:
> Upon first boot the /dev/fw0 and /dev/fw1 nodes did not show up
> initially [1] (log without using -1 debug parameter), which is why I
> switched the device off and on again.
> Afterwards the device nodes were created, so I attempted to use aplay -l
> to check whether the device was detected. The command hung forever (also
> starting jackd) and I had to reboot eventually. Using alsamixer I was
> able to select the Fireface800 device from the list of devices though.
> 
> Upon 2nd boot I used the -1 debug parameter for firewire-ohci [2].
> The device nodes were created automatically, but neither aplay -l nor
> alsamixer showed the audio interface. The attempt to switch the device
> off and on again and afterwards to remove the firewire-ohci module lead
> to another freeze and a successive reboot.
> 
> If I can provide further information, please let me know!
> 
> I hope I can try the vanilla 5.14.1 kernel tomorrow to see how it
> behaves in regards to firewire-ohci (judging from the past it will
> probably work though).
> 
> Best,
> David
> 
> P.S.: I have added Kristian Hoegsberg in CC, who is mentioned as the
> author of firewire-ohci and firewire-core.
> 
> [1] https://pkgbuild.com/~dvzrv/bugs/2021/09/linux-5.14.1.19.realtime1-kernel.log
> [2] https://pkgbuild.com/~dvzrv/bugs/2021/09/linux-5.14.1.19.realtime1-restart_device_kernel.log

```
kernel: INFO: task pipewire-media-:2554 blocked for more than 122 seconds.
kernel:       Not tainted 5.14.1.19.realtime1-1-rt #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: task:pipewire-media- state:D stack:    0 pid: 2554 ppid:  1803 flags:0x00000220
kernel: Call Trace:
kernel:  __schedule+0x338/0x1580
kernel:  ? __mod_zone_page_state+0x81/0xf0
kernel:  ? rt_spin_unlock+0x13/0x40
kernel:  ? rmqueue_bulk+0x201/0x820
kernel:  schedule+0x43/0xf0
kernel:  schedule_timeout+0x14d/0x190
kernel:  wait_for_completion+0x9e/0x100
kernel:  fw_run_transaction+0xd7/0x110 [firewire_core bb04f32445782056ab0dc73199b57a87dc27c06d]
...
```

According to the log, the task of 'pipewire-media-:2554' is blocked during
122 seconds by call of 'wait_for_completion()' in code of
'fw_run_transaction()'. This is odd in two points of transaction service
programmed in Linux FireWire subsystem:

1. The process context should be awakened by softIRQ context, which should
   be scheduled by hwIRQ context for hardware interrupt of OHCI 1394
   controller.
2. Even if the softIRQ context is not invoked, the process context
   should be awakened by wheel timer context, which is scheduled to finish
   the transaction several jiffies later (originally prepared for the case
   of split-transaction). In the case, the result of transaction is
   'RCODE_CANCELLED'.

The call graph is below:

1 in the process context:

```
(drivers/firewire/core-transaction.c)
->fw_run_transaction()
  ->timer_setup_on_stack()
  ->fw_send_request()
    ->timer_setup()
    ->struct fw_card_driver.send_request()
    (drivers/firewire/ohci.c)
    = ohci_send_request()
      ...
      ->writel()
  ->wait_for_completion()
    (blocked and awakened by softIRQ context later)
  ->destroy_timer_on_stack()
```

2 in the hwIRQ context:

```
(drivers/firewire/ohci.c)
->irq_handler()
  ->tasklet_schedule(&ohci->at_response_ctx.tasklet);
```

3 in the softIRQ context:

```
(drivers/firewire/core-transaction.c)
->handle_at_packet()
  ->struct fw_packet.callback()
  = transmit_complete_callback()
    ->close_transaction()
      ->struct fw_transaction.callback()
      = transaction_callback()
        ->complete()
```

4. in the timer wheel context:

```
(drivers/firewire/core-transaction.c)
->split_transaction_timeout_callback()
  ->struct fw_packet.callback()
  = transaction_callback()
    ->complete()
```

I guess we have issue in the softIRQ context or timer wheel context
instead of hwIRQ context, if the OHCI 1394 controller you use has
no quirk (I suggest you to give information about the controller).

I'd like you to check it. The firewire-ohci module has 'debug' parameter.

``
$ modinfo firewire-ohci
...
parm:           debug:Verbose logging (default = 0, AT/AR events = 1, self-IDs = 2, IRQs = 4, busReset events = 8, or a combination, or all = -1) (int)
``

When the parameter has value '5' (= 1 | 4), we can see information per
transaction, like:

```
kernel: firewire_ohci 0000:0b:00.0: IRQ 00000001 AT_req
kernel: firewire_ohci 0000:0b:00.0: AT spd 2 tl 27, ffc1 -> ffc0, ack_pending , QR req, fffff0000400
kernel: firewire_ohci 0000:0b:00.0: IRQ 00000020 AR_resp
kernel: firewire_ohci 0000:0b:00.0: AR spd 2 tl 27, ffc0 -> ffc1, ack_complete, QR resp = 04101573
```

The first line:  in hwIRQ context for request subaction
The second line: in softIRQ context scheduled by the above.
The third line:  in hwIRQ context for response subaction
The fourth line: in softIRQ context scheduled by the above.

If we have any issue in invocation of softIRQ, we would often see
missing swIRQ logs (and it might be the most concern to RT Linux
developers).

If you need instruction to configure module parameters, 'man modprobe.d'
may be a good help.


Thanks

Takashi Sakamoto



[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux