Re: 64-bit userspace root file system for hppa64

Guenter Roeck <linux@xxxxxxxxxxxx> · Fri, 8 Dec 2023 14:39:23 -0800

On 12/8/23 13:19, Mark Cave-Ayland wrote:
On 08/12/2023 19:56, Guenter Roeck wrote:

On 12/8/23 10:53, Mark Cave-Ayland wrote:
On 08/12/2023 14:58, Guenter Roeck wrote:

On 12/8/23 00:01, Mark Cave-Ayland wrote:
On 07/12/2023 21:47, Helge Deller wrote:

(looping in Mark Cave-Ayland, since he did some work on qemu esp driver)

Thanks for the ping!

On 12/7/23 22:08, Guenter Roeck wrote:
Hi Helge,

On 12/6/23 13:43, Helge Deller wrote:
On 12/6/23 21:19, Guenter Roeck wrote:
On 12/6/23 09:00, Helge Deller wrote:
[ ... ]
Is it worth testing with multiple CPUs ? I can re-enable it and
check more closely if you think it makes sense. If so, what number
of CPUs would you recommend ?

I think 4 CPUs is realistic.
But I agree that you will probably see more issues.

Generally the assumption was that the different caches on parisc
may trigger SMP issues, but given that those issues can be seen on
qemu, it indicates that there are generic SMP issues too.


Ok, I ran some tests overnight with 2-8 CPUs. Turns out the system is quite
stable,

cool!

with the exception of SCSI controllers. Some fail completely, others
rarely. Here is a quick summary:

- am53c974 fails with "Spurious irq, sreg=00", followed by "Aborting command"
   and a hung task crash.
- megasas and megasas-gen2 fail with
   "scsi host1: scsi scan: INQUIRY result too short (5), using 36"
   followed by
   "megaraid_sas 0000:00:04.0: Unknown command completed!"
   and a hung task crash
- mptsas1068 fails completely (no kernel log message seen)
- dc390 and lsi* report random "Spurious irq, sreg=00" messages and timeouts

I think none of those drivers have ever been tested
on physical hardware either.
So I'm astonished that it even worked that far :-)

I actually do have a dc390 board somewhere. I used it some time ago to improve
the emulation.

Do you have a physical hppa box too?

Based on kernel sources, the "Spurious irq, sreg=%02x." error can only happen for the
am53c974 driver. Are you sure you see this message for dc390 and lsi* too?

am53c974 and dc390 use the same driver. lsi* doesn't, and doesn't have a problem
either. Sorry, I confused that with some old notes.

In either case, I think I found the problem. After handling an interrupt, the Linux
driver checks if another interrupt is pending. It does that by checking the
DMA_DONE bit in the DMA status register. If that bit is set, it re-enters the
interrupt handler. The problem is that the emulation sets DMA_DONE
prematurely, before it sets the command done bit in the interrupt status register
and before it sets the interrupt pending bit in the status register. As a result,
DMA_DONE is set but IRQ_PENDING isn't, and the spurious interrupt is reported.
I fixed that up in my code and will test it for some time and with various
architectures before I send a patch.
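
[Illustration] The ordering problem described above can be sketched as a
small stand-alone C model. This is not QEMU or kernel code: every name here
(model_dev, model_irq_pending, and so on) and both bit values are
hypothetical, chosen only to show why a poll that lands between "DMA counter
reached zero" and "command complete" would see DMA_DONE without the
interrupt bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen for illustration only. */
#define MODEL_DMA_STAT_DONE 0x08u /* DMA status: transfer done */
#define MODEL_STAT_INTR     0x80u /* status register: interrupt pending */

struct model_dev {
    uint8_t dma_stat; /* stands in for the DMA status register */
    uint8_t esp_stat; /* stands in for the chip status register */
};

/* Driver side: "is another interrupt pending?" looks only at DMA status. */
static bool model_irq_pending(const struct model_dev *d)
{
    return (d->dma_stat & MODEL_DMA_STAT_DONE) != 0;
}

/* Driver side: entering the handler without the interrupt bit is spurious. */
static bool model_is_spurious(const struct model_dev *d)
{
    return (d->esp_stat & MODEL_STAT_INTR) == 0;
}

/* Device side, step 1: the DMA byte counter reaches zero. */
static void model_dma_finished(struct model_dev *d, bool set_done_early)
{
    if (set_done_early)
        d->dma_stat |= MODEL_DMA_STAT_DONE;
}

/* Device side, step 2: command completion sets both bits together. */
static void model_command_complete(struct model_dev *d)
{
    d->esp_stat |= MODEL_STAT_INTR;
    d->dma_stat |= MODEL_DMA_STAT_DONE;
}

/*
 * Poll once between step 1 and step 2, and once after step 2.
 * Returns how many of those polls would be reported as spurious.
 */
static int model_count_spurious(bool set_done_early)
{
    struct model_dev d = { 0, 0 };
    int spurious = 0;

    model_dma_finished(&d, set_done_early);
    if (model_irq_pending(&d) && model_is_spurious(&d))
        spurious++;

    model_command_complete(&d);
    assert(model_irq_pending(&d)); /* DONE is never lost in this model */
    if (model_irq_pending(&d) && model_is_spurious(&d))
        spurious++;

    return spurious;
}
```

In this model, model_count_spurious(true) counts one spurious entry (the
poll that falls in the window), while model_count_spurious(false) counts
none, because DONE and the interrupt bit then only ever appear together.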

I'm actually in the process of putting the finishing touches to a large rewrite of QEMU's core ESP emulation since there are a number of known issues with the existing version. In particular there are problems with the SCSI phase being set incorrectly after reading ESP_INTR and ESP_RSTAT's STAT_TC not being correct. Note that this is just the ESP core rather than the ESP PCI device.

If you are interested, I could try and find a few minutes to tidy it up a bit more and push a testing branch to GitHub?


Sure, I'll be happy to give your changes a try.

FWIW, the change I made to fix the spurious interrupt problem is

diff --git a/hw/scsi/esp-pci.c b/hw/scsi/esp-pci.c
index 6794acaebc..f624398c55 100644
--- a/hw/scsi/esp-pci.c
+++ b/hw/scsi/esp-pci.c
@@ -286,9 +286,6 @@ static void esp_pci_dma_memory_rw(PCIESPState *pci, uint8_t *buf, int len,
      /* update status registers */
      pci->dma_regs[DMA_WBC] -= len;
      pci->dma_regs[DMA_WAC] += len;
-    if (pci->dma_regs[DMA_WBC] == 0) {
-        pci->dma_regs[DMA_STAT] |= DMA_STAT_DONE;
-    }
  }

I tested that with several platforms. There are no more spurious interrupts
after that change, and no other errors either.

I suspect that this is papering over the real issue, since it appears the code being removed sets the DMA completion bit when the PCI DMA transfer counter reaches zero.


DMA_STAT_DONE is also set in esp_pci_command_complete(), so it doesn't get lost.

That doesn't seem right from a QEMU perspective: the command_complete callback is invoked when the SCSI layer has completed its data transfer to the emulated device, or immediately if there is no data phase. From a DMA perspective triggering an interrupt when the byte counter is zero feels like it should be the correct behaviour.

The problem is that the Linux kernel driver assumes that the interrupt status bit
is set in parallel with DMA_STAT_DONE. The spurious interrupt is seen because
that is not the case. There may be a better solution, of course. I'll be happy
to give it a try if you find a better solution.

Could you provide a GitHub link to the file/line in question so I can have a look?


Assuming you mean the Linux kernel:

In esp_scsi.c:

The interrupt loop is in scsi_esp_intr(). It calls esp->ops->irq_pending(esp)
to check if another interrupt is pending. Subsequently, it calls __esp_interrupt()
to handle it. __esp_interrupt() calls esp_check_spur_intr(), which expects ESP_INTR_SR
to be set.

The irq_pending function is am53c974.c:pci_esp_irq_pending(). It checks the DMA status
register and assumes that an interrupt is pending if any of (ESP_DMA_STAT_ERROR |
ESP_DMA_STAT_ABORT | ESP_DMA_STAT_DONE | ESP_DMA_STAT_SCSIINT) is set in the DMA
status register.
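
[Illustration] The control flow described above can be sketched roughly as
follows. This is a simplified stand-alone model, not the kernel source: the
SKETCH_* names, the bit positions, and the snapshot array are assumptions
made for illustration, and the real handler does considerably more than
count spurious entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions; only the set of four flags follows the text. */
#define SKETCH_DMA_STAT_ERROR   0x02u
#define SKETCH_DMA_STAT_ABORT   0x04u
#define SKETCH_DMA_STAT_DONE    0x08u
#define SKETCH_DMA_STAT_SCSIINT 0x10u

#define SKETCH_DMA_STAT_MASK                           \
    (SKETCH_DMA_STAT_ERROR | SKETCH_DMA_STAT_ABORT |   \
     SKETCH_DMA_STAT_DONE | SKETCH_DMA_STAT_SCSIINT)

/* One sampled view of the hardware as the loop would read it. */
struct sketch_regs {
    uint8_t dma_status; /* DMA status register contents */
    bool esp_intr;      /* does the chip itself report an interrupt? */
};

/* Pending if any of the four DMA status flags is set. */
static bool sketch_irq_pending(uint8_t dma_status)
{
    return (dma_status & SKETCH_DMA_STAT_MASK) != 0;
}

/*
 * Model of the interrupt loop: keep handling while the DMA status says an
 * interrupt is pending. An iteration where the chip does not report an
 * interrupt is counted as spurious. Each array element is the register
 * state seen on one pass through the loop.
 */
static int sketch_intr_loop(const struct sketch_regs *snap, size_t n)
{
    int spurious = 0;

    for (size_t i = 0; i < n && sketch_irq_pending(snap[i].dma_status); i++) {
        if (!snap[i].esp_intr)
            spurious++;
    }
    return spurious;
}
```

Feeding the loop a first snapshot with only the DONE flag set and no chip
interrupt, followed by one where both agree, reproduces the pattern
discussed in this thread: the first pass is counted as spurious, the second
is handled normally, and the loop ends once no flag in the mask is set.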

Hope this helps,

Guenter