Re: 64-bit userspace root file system for hppa64

Guenter Roeck <linux@xxxxxxxxxxxx> · Fri, 8 Dec 2023 06:58:26 -0800

On 12/8/23 00:01, Mark Cave-Ayland wrote:
On 07/12/2023 21:47, Helge Deller wrote:

(looping in Mark Cave-Ayland, since he did some work on qemu esp driver)

Thanks for the ping!

On 12/7/23 22:08, Guenter Roeck wrote:
Hi Helge,

On 12/6/23 13:43, Helge Deller wrote:
On 12/6/23 21:19, Guenter Roeck wrote:
On 12/6/23 09:00, Helge Deller wrote:
[ ... ]
Is it worth testing with multiple CPUs ? I can re-enable it and
check more closely if you think it makes sense. If so, what number
of CPUs would you recommend ?

I think 4 CPUs is realistic.
But I agree, that you probably see more issues.

Generally the assumption was, that the different caches on parisc
may trigger SMP issues, but given that those issues can be seen on
qemu, it indicates that there are generic SMP issues too.


Ok, I ran some tests overnight with 2-8 CPUs. Turns out the system is quite
stable,

cool!

with the exception of SCSI controllers. Some fail completely, others
rarely. Here is a quick summary:

- am53c974 fails with "Spurious irq, sreg=00", followed by "Aborting command"
   and a hung task crash.
- megasas and megasas-gen2 fail with
   "scsi host1: scsi scan: INQUIRY result too short (5), using 36"
   followed by
   "megaraid_sas 0000:00:04.0: Unknown command completed!"
   and a hung task crash
- mptsas1068 fails completely (no kernel log message seen)
- dc390 and lsi* report random "Spurious irq, sreg=00" messages and timeouts

I think none of those drivers have ever been tested
on physical hardware either.
So I'm astonished that it even worked that far :-)

I actually do have a dc390 board somewhere. I used it some time ago to improve
the emulation.

Do you have a physical hppa box too?

Based on kernel sources, the "Spurious irq, sreg=%02x." error can only happen for the
am53c974 driver. Are you sure you see this message for dc390 and lsi* too?

am53c974 and dc390 use the same driver. lsi* doesn't, and doesn't have a problem
either. Sorry, I confused that with some old notes.

Either case, I think I found the problem. After handling an interrupt, the Linux
driver checks if another interrupt is pending. It does that by checking the
DMA_DONE bit in the DMA status register. If that bit is set, it re-enters the
interrupt handler. Problem with that is that the emulation sets DMA_DONE
prematurely, before it sets the command done bit in the interrupt status register
and before it sets the interrupt pending bit in the status register. As result,
DMA_DONE is set but IRQ_PENDING isn't, and the spurious interrupt is reported.
I fixed that up in my code and will test it for some time and with various
architectures before I send a patch.

I'm actually in the process of putting the finishing touches to a large rewrite of QEMU's core ESP emulation since there are a number of known issues with the existing version. In particular there are problems with the SCSI phase being set incorrectly after reading ESP_INTR and ESP_RSTAT's STAT_TC not being correct. Note that this is just the ESP core rather than the ESP PCI device.

If you are interested, I could try and find a few minutes to tidy it up a bit more and push a testing branch to Github?


Sure, I'll be happy to give your changes a try.

FWIW, the change I made to fix the spurious interrupt problem is

diff --git a/hw/scsi/esp-pci.c b/hw/scsi/esp-pci.c
index 6794acaebc..f624398c55 100644
--- a/hw/scsi/esp-pci.c
+++ b/hw/scsi/esp-pci.c
@@ -286,9 +286,6 @@ static void esp_pci_dma_memory_rw(PCIESPState *pci, uint8_t *buf, int len,
     /* update status registers */
     pci->dma_regs[DMA_WBC] -= len;
     pci->dma_regs[DMA_WAC] += len;
-    if (pci->dma_regs[DMA_WBC] == 0) {
-        pci->dma_regs[DMA_STAT] |= DMA_STAT_DONE;
-    }
 }

I tested that with several platforms. There are no more spurious interrupts
after that change, and no other errors either.

Regarding TC after reading the interrupt register, I carry the following
patch locally.

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 9b11d8c573..f0cd8705a7 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -986,7 +986,7 @@ uint64_t esp_reg_read(ESPState *s, uint32_t saddr)
          */
         val = s->rregs[ESP_RINTR];
         s->rregs[ESP_RINTR] = 0;
-        s->rregs[ESP_RSTAT] &= ~STAT_TC;
+        // s->rregs[ESP_RSTAT] &= ~STAT_TC;

The comment above that code says "Clear sequence step, interrupt register
and all status bits except TC", which is quite the opposite of what the code
is doing because it clears TC and nothing else. I never spent the time
trying to figure out how to fix that properly; clearing the other bits
like the comment suggests doesn't work (STAT_INT needs to be set for
esp_lower_irq() to work, and clearing the other bits results in transfer
failures).

Thanks,
Guenter