Re: Need help to debug ata errors

On 2021/12/06 20:18, Ayan Kumar Halder wrote:
> Hi Damien,
> 
> Thanks a lot for your inputs.
> 
> On 06/12/2021 00:12, Damien Le Moal wrote:
>> On 2021/12/03 20:11, Ayan Kumar Halder wrote:
>>> Hi All,
>>>
>>> I am trying to run Linux as a DomU guest on Xen with AHCI assigned to it.
>>> I can confirm that SATA works (i.e. it is able to detect sdb) as a Dom0 guest.
>>> However, it does not work as a DomU guest.
>>>
>>> Hardware :- ZCU102 board, which has two SATA ports
>>> Kernel :- 5.10
>>>
>>> I have enabled the debug logs in drivers/ata
>>>
>>> 1. Logs from dom0 (where SATA works) https://pastebin.com/2BhMDq47
>>> 2. Logs from domU (where SATA does not work) https://pastebin.com/fE8WZnZ0
>>>
>>> Can someone help me answer these questions?
>>> 1. What does "1st FIS failed" mean?
>>>
>>> 2. In the dom0 logs, PORT_SCR_ERR = 0x41d0002 whereas in domU logs,
>>> PORT_SCR_ERR = 0. Does this give any hints?
>>>
>>> 3. Any other issues or hints to debug this ?
>>>
>>> I can confirm that in domU scenario, we do not get any interrupts from
>>> the device. What might be going wrong here ?
>>
>> That would be the first thing to check, since without interrupts you will not get
>> any command completion. Commands will time out and the probe will not work.
>> And this IRQ problem is Xen territory, not ata.
> 
> I am actually debugging the interrupts from the Xen side. I can 
> confirm that do_IRQ() (Xen's irq handler) does not receive AHCI 
> interrupts. It does get invoked for interrupts from serial and other 
> devices.
> 
> I have seen commands timing out, which is due to the IRQ issue. But 
> surprisingly, the AHCI probe is successful.

That cannot be. Without any interrupt, there will be no command completion.
Commands that time out are retried. So you may have seen timeouts because the
platform or device is very slow to respond, but you must be getting interrupts
if you get a good device probe. Otherwise, you would not see any disk connected
to your ports.

>>
>> The "1st FIS failed" error may be due to some problems with AHCI PCI BAR/register
>> accesses, which may not be working. This, I think, points again to the Xen setup
>> with domU, which may not have the necessary access rights to get IRQs and PCI BAR
>> accesses? (I have no experience with Xen.)
> 
> This is the device tree https://pastebin.com/HtdLx63v . I think it is 
> not related to PCI bus. Please correct me if mistaken.

Well, since you have an ahci node, I do not think that adapter is behind the PCI
bus :) It is a child of the axi bus. I am not familiar with that type of setup...
Are you sure all properties of the ahci node are correct?

> I have the necessary debug support from Xen. Can you let me know what 
> bits I can debug from the SATA side (e.g. reading a particular register) 
> which will confirm whether SATA has been programmed correctly or not?

The device probe with domU should be no different than what it is with dom0, I
think. Again, I do not have experience with Xen, so not entirely sure.

Note that from the dmesg you sent, for the working case, the port seems to be
awfully slow to link up. Not sure if that is normal for this platform.
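
Regarding the PORT_SCR_ERR values quoted above (question 2), here is a minimal
sketch of how such a value could be decoded offline. It is only an
illustration: the bit positions follow the SERR_* definitions in
include/linux/libata.h, while the table name SERR_BITS, the helper
decode_serror() and the descriptive strings are made up for this example.

```python
# Sketch: decode a SATA SError (PORT_SCR_ERR) value into named bits.
# Bit positions mirror the SERR_* constants in include/linux/libata.h.
# SERR_BITS and decode_serror() are hypothetical names used for illustration.

SERR_BITS = {
    0: "recovered data integrity error",
    1: "recovered communications error",
    8: "transient data integrity error",
    9: "persistent communication or data integrity error",
    10: "protocol error",
    11: "internal error",
    16: "PhyRdy change",
    17: "Phy internal error",
    18: "Comm wake",
    19: "10b to 8b decode error",
    20: "disparity error",
    21: "CRC error",
    22: "handshake error",
    23: "link sequence error",
    24: "transport state transition error",
    25: "unrecognized FIS type",
    26: "device exchanged",
}


def decode_serror(value):
    """Return (bit, description) pairs for every known bit set in value."""
    return [(bit, name) for bit, name in sorted(SERR_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # 0x041d0002 is the value reported in the dom0 log.
    for bit, name in decode_serror(0x041D0002):
        print(f"bit {bit:2d}: {name}")
```

For 0x041d0002 this reports bits 1, 16, 18, 19, 20 and 26, all of which are
link-level events. A value of 0 only means that none of these bits is set at
the time the register is read.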


-- 
Damien Le Moal
Western Digital Research


